CDPEx.Page (CDPEx v0.3.0)


A page (tab) handle and the operations you run against it.

A CDPEx.Page is a lightweight struct — not a process — holding the page's CDPEx.Connection pid and target id. Operations are functions over that connection, so the OTP properties (supervision, crash isolation) live in the connection/browser layer while page calls stay ergonomic.

Obtain one with CDPEx.new_page/2. If the underlying page dies (navigation to a new target, a crash), operations return {:error, :noproc} and you should open a fresh page.
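As a rough sketch of the recovery described above, a caller can retry an operation once on a fresh page when it sees {:error, :noproc}. The FreshPage module and the open_page callback are hypothetical names. The callback is assumed to wrap CDPEx.new_page/2 and to return {:ok, page} or {:error, reason}, which this page does not spell out.

```elixir
defmodule FreshPage do
  @moduledoc false

  # Runs `op` (a 1-arity fun taking a page) and retries it once on a fresh
  # page if the first attempt reports a dead page.
  # Returns {page_to_keep_using, result}.
  def run(page, open_page, op) when is_function(open_page, 0) and is_function(op, 1) do
    case op.(page) do
      {:error, :noproc} ->
        # Assumption: open_page wraps CDPEx.new_page/2 and returns {:ok, page}.
        case open_page.() do
          {:ok, fresh} -> {fresh, op.(fresh)}
          {:error, _reason} = error -> {page, error}
        end

      result ->
        {page, result}
    end
  end
end
```

The operation is passed as a function (for example &CDPEx.Page.html/1) so that the same wrapper works for any page call. It retries only once, so a page that keeps dying still surfaces as an error.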

Operations

Summary

Functions

attribute/4

Returns the value of the attribute name on the first element matching css, or nil when the element or attribute is absent.

authenticate/4

Arms HTTP/proxy authentication on this page with username/password.

call_function/4

Calls a JavaScript function with args and returns its value.

clear_cookies/2

Clears all browser cookies. Lazily enables Network. Options: :timeout.

click/3

Clicks the first element matching css (a synthetic JS .click()).

continue_request/3

Lets a paused request proceed (Fetch.continueRequest), optionally rewriting it.

cookies/2

Returns all browser cookies as a list of CDP cookie maps.

disable_request_interception/2

Disables request interception: unsubscribes the caller from Fetch.requestPaused and disables the Fetch domain. Resolve any still-paused requests first.

enable_request_interception/2

Enables request interception: pauses matching requests and delivers a Fetch.requestPaused event to the calling process for each one. You must then resolve every paused request with continue_request/3, fulfill_request/3, or fail_request/3 (keyed by its "requestId"); an unresolved request stalls the page.

evaluate/3

Evaluates a JavaScript expression and returns its value (returnByValue).

fail_request/3

Fails a paused request (Fetch.failRequest).

fulfill_request/3

Answers a paused request with a synthetic response (Fetch.fulfillRequest); the page never hits the network for it.

html/2

Returns the page's full serialized HTML (document.documentElement.outerHTML).

navigate/3

Navigates to url and (by default) waits until the network is almost idle.

observe_network/2

Starts observing network traffic, delivering CDP Network events to the calling process.

pdf/2

Renders the page to PDF (Page.printToPDF).

response_body/3

Returns a response's body by its request_id (from a Network.responseReceived event), via Network.getResponseBody.

screenshot/2

Captures a PNG screenshot.

set_cookies/3

Sets cookies. Each is a CDP CookieParam map with at least "name", "value", and a "url" or "domain". Lazily enables Network. Options: :timeout.

set_extra_headers/3

Sets extra HTTP headers sent with every subsequent request on this page.

set_user_agent/3

Overrides the page's User-Agent (Emulation.setUserAgentOverride).

set_viewport/4

Overrides the viewport via Emulation.setDeviceMetricsOverride.

stop_observing_network/2

Stops observing: unsubscribes the caller from the network :events. Leaves the Network domain enabled.

text/3

Returns the textContent of the first element matching css, or nil when no element matches.

visible?/3

Returns {:ok, true} when the first element matching css is rendered and visible (has layout boxes, not display: none / visibility: hidden), {:ok, false} otherwise, including when no element matches.

wait_for_function/3

Polls a JavaScript expression until it is truthy, or timeout elapses.

wait_for_navigation/2

Waits for a navigation lifecycle milestone, without issuing a navigation.

wait_for_selector/3

Polls until css matches an element, or timeout elapses.

Types

t()

@type t() :: %CDPEx.Page{
  browser: pid(),
  conn: pid(),
  session_id: String.t() | nil,
  target_id: String.t()
}

Functions

attribute(page, css, name, opts \\ [])

@spec attribute(t(), String.t(), String.t(), keyword()) ::
  {:ok, String.t() | nil} | {:error, term()}

Returns the value of the attribute name on the first element matching css, or nil when the element or attribute is absent.

authenticate(page, username, password, opts \\ [])

@spec authenticate(t(), String.t(), String.t(), keyword()) :: :ok | {:error, term()}

Arms HTTP/proxy authentication on this page with username/password.

Headless Chrome launched with --proxy-server=host:port can't send proxy credentials, so an authenticated proxy rejects the connection (net::ERR_INVALID_AUTH_CREDENTIALS). Call this after new_page/2 and before navigate/3: it answers the proxy (or HTTP Basic) auth challenge with the given credentials. It also covers Basic-auth-gated origins.

This enables the CDP Fetch domain for the page, which pauses (and auto-continues) every request — measurable overhead on heavy pages.

Only :dedicated pages (the new_page/2 default) are supported; a :session page returns {:error, {:unsupported_transport, :session}}. A page that isn't one of this browser's open pages returns {:error, :unknown_page}, and a page that is already authenticated returns {:error, :already_authenticated}.

The bad-credentials loop guard keys on the request id, so a single request that must answer both a proxy and an origin challenge isn't supported — the second challenge is cancelled (Puppeteer-parity).

Options:

  • :source — which challenges to answer: :any (default), :proxy, :server. An unknown value returns {:error, {:invalid_source, value}}.

call_function(page, function_declaration, args \\ [], opts \\ [])

@spec call_function(t(), String.t(), [term()], keyword()) ::
  {:ok, term()} | {:error, term()}

Calls a JavaScript function with args and returns its value.

function_declaration is a JS function expression (e.g. "(a, b) => a + b"). args are JSON-encoded (not string-interpolated) before being applied, so passing data values through them is safe. A thrown exception is {:error, {:evaluate_exception, details}}; non-serializable args return {:error, {:invalid_args, reason}}.

Trusted input

function_declaration is interpolated into the page script verbatim — treat it as trusted code and never build it from untrusted input.

Options: :timeout (default 15_000), :await_promise (default false).
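To illustrate the split between trusted code and untrusted data, the sketch below keeps the function source as a compile-time constant and passes the caller-supplied selector only through args. The SafeCall module is hypothetical, and the mod parameter is an assumption added so the example can be exercised without a browser; in normal use it stays at its CDPEx.Page default.

```elixir
defmodule SafeCall do
  @moduledoc false

  # Trusted code: a constant function expression, never built from input.
  @count_matches "(selector) => document.querySelectorAll(selector).length"

  # Untrusted data (`selector`) travels only through the args list, which the
  # docs above say is JSON-encoded rather than interpolated.
  # `mod` is injectable for illustration; it defaults to CDPEx.Page.
  def count_matches(page, selector, mod \\ CDPEx.Page) when is_binary(selector) do
    mod.call_function(page, @count_matches, [selector], timeout: 5_000)
  end
end
```

The anti-pattern this avoids is building the declaration with string interpolation, such as "() => document.querySelectorAll('#{selector}').length", where a crafted selector could inject script.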

clear_cookies(page, opts \\ [])

@spec clear_cookies(
  t(),
  keyword()
) :: :ok | {:error, term()}

Clears all browser cookies. Lazily enables Network. Options: :timeout.

click(page, css, opts \\ [])

@spec click(t(), String.t(), keyword()) :: :ok | {:error, term()}

Clicks the first element matching css (a synthetic JS .click()).

Returns :ok, or {:error, {:selector_not_found, css}} when nothing matches.

continue_request(page, request_id, opts \\ [])

@spec continue_request(t(), String.t(), keyword()) :: :ok | {:error, term()}

Lets a paused request proceed (Fetch.continueRequest), optionally rewriting it.

:url, :method, and :headers are verbatim overrides, not merges. In particular :headers replaces the entire request header set, so passing it to set one header drops everything Chrome would otherwise send (User-Agent, Accept, Cookie, …). Omit :headers to leave the original request headers intact (the same gotcha as Puppeteer's continueRequest({headers})).

Options (all optional): :url, :method, :headers (a name => value map or keyword list), :post_data (a binary or iodata, base64-encoded for you), :timeout.
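Because :headers replaces the whole header set, one way to change a single header is to start from the headers carried by the paused request and override just that entry. The helper below is a sketch under two assumptions: the HeaderMerge module name is hypothetical, and the Fetch.requestPaused params are assumed to carry the original headers under "request" => "headers" as a name => value map (the shape CDP uses for its Request object). That map may not list every header Chrome adds later in the network stack.

```elixir
defmodule HeaderMerge do
  @moduledoc false

  # Builds a full header map from a Fetch.requestPaused `params` map, with one
  # header added or replaced. Header names are compared case-insensitively so
  # an existing "x-token" does not survive next to a new "X-Token".
  def put_header(%{"request" => %{"headers" => headers}}, name, value)
      when is_map(headers) and is_binary(name) and is_binary(value) do
    wanted = String.downcase(name)

    headers
    |> Enum.reject(fn {key, _value} -> String.downcase(key) == wanted end)
    |> Map.new()
    |> Map.put(name, value)
  end
end
```

Inside the process that receives the pause, the result would be passed as continue_request(page, params["requestId"], headers: HeaderMerge.put_header(params, "X-Trace", "1")).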

cookies(page, opts \\ [])

@spec cookies(
  t(),
  keyword()
) :: {:ok, [map()]} | {:error, term()}

Returns all browser cookies as a list of CDP cookie maps.

Lazily enables the Network domain. Options: :timeout (default 10_000).

disable_request_interception(page, opts \\ [])

@spec disable_request_interception(
  t(),
  keyword()
) :: :ok | {:error, term()}

Disables request interception — unsubscribes the caller from Fetch.requestPaused and disables the Fetch domain. Resolve any still-paused requests first.

Call this from the same process that called enable_request_interception/2: the unsubscribe is keyed to self(), so a disable from a different process leaves the original subscriber still receiving (now-unresolvable) pauses.

enable_request_interception(page, opts \\ [])

@spec enable_request_interception(
  t(),
  keyword()
) :: :ok | {:error, term()}

Enables request interception: pauses matching requests and delivers a Fetch.requestPaused event to the calling process for each one. You must then resolve every paused request with continue_request/3, fulfill_request/3, or fail_request/3 (keyed by its "requestId") — an unresolved request stalls the page.

Each pause arrives as {:cdp_event, conn, "Fetch.requestPaused", params, session_id}; handle it in a handle_info. The caller is subscribed before the domain is enabled, so no paused request is missed.

The caller owns the lifecycle

Nothing ties the enabled Fetch domain to the calling process. If the caller exits, or simply never calls disable_request_interception/2, Fetch stays enabled with no resolver and every subsequent request pauses forever: the page stalls permanently and can't even navigate away. Unlike a leaked Network.enable, a leaked Fetch.enable bricks the page. Drive interception from a long-lived process you control, and use that same process for enable, the pause handling, and disable (the subscription is keyed to its pid).

On a :session-transport page the caller receives every session's Fetch.requestPaused events on the shared connection (subscriptions are keyed by method, not session); match on the session_id element to filter to this page.

Mutually exclusive with authenticate/4 on the same page — both drive the Fetch domain. The conflict is not enforced and fails silently: enabling interception on an authenticated page re-runs Fetch.enable without handleAuthRequests (breaking auth) while the auth handler keeps racing the caller for each pause. Use one or the other per page.

Options:

  • :patterns — CDP RequestPatterns (default [%{"urlPattern" => "*"}], all requests)
  • :timeout — ms for the enable call (default 10_000)
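The lifecycle advice above can be sketched as a small GenServer that enables interception in init/1, resolves every pause in handle_info/2, and disables in terminate/2, all from the same pid. This is a minimal illustration, not the library's own pattern: the Interceptor module, its image-blocking rule, and the injectable mod parameter (defaulting to CDPEx.Page) are assumptions, as is the "request" => "url" field read from the event params.

```elixir
defmodule Interceptor do
  @moduledoc false
  use GenServer

  # `mod` is injectable for illustration; it defaults to CDPEx.Page.
  def start_link(page, mod \\ CDPEx.Page) do
    GenServer.start_link(__MODULE__, {page, mod})
  end

  @impl true
  def init({page, mod}) do
    # Trap exits so terminate/2 also runs on an orderly supervisor shutdown.
    Process.flag(:trap_exit, true)

    # Enable from this process: the subscription is keyed to self().
    case mod.enable_request_interception(page, patterns: [%{"urlPattern" => "*"}]) do
      :ok -> {:ok, %{page: page, mod: mod}}
      {:error, reason} -> {:stop, reason}
    end
  end

  @impl true
  def handle_info({:cdp_event, _conn, "Fetch.requestPaused", params, _session_id}, state) do
    %{"requestId" => id, "request" => %{"url" => url}} = params

    # Every pause must be resolved: block images, let everything else through.
    _result =
      if String.ends_with?(url, [".png", ".jpg"]) do
        state.mod.fail_request(state.page, id, reason: :blocked_by_client)
      else
        state.mod.continue_request(state.page, id, [])
      end

    {:noreply, state}
  end

  def handle_info(_other, state), do: {:noreply, state}

  @impl true
  def terminate(_reason, state) do
    # Disable from the same process that enabled.
    state.mod.disable_request_interception(state.page, [])
  end
end
```

terminate/2 does not run on a brutal kill, so this narrows the window for a leaked Fetch.enable but does not close it. On a :session-transport page the handle_info clause would also need to match on the session id.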

evaluate(page, js, opts \\ [])

@spec evaluate(t(), String.t(), keyword()) :: {:ok, term()} | {:error, term()}

Evaluates a JavaScript expression and returns its value (returnByValue).

A thrown JS exception is {:error, {:evaluate_exception, details}}.

Options: :timeout (default 15_000), :await_promise (default false).

fail_request(page, request_id, opts \\ [])

@spec fail_request(t(), String.t(), keyword()) :: :ok | {:error, term()}

Fails a paused request (Fetch.failRequest).

:reason (default :failed) is one of :failed, :aborted, :timed_out, :access_denied, :connection_closed, :connection_reset, :connection_refused, :name_not_resolved, :internet_disconnected, :address_unreachable, :blocked_by_client, :blocked_by_response. An unknown value returns {:error, {:invalid_error_reason, value}}.

fulfill_request(page, request_id, opts \\ [])

@spec fulfill_request(t(), String.t(), keyword()) :: :ok | {:error, term()}

Answers a paused request with a synthetic response (Fetch.fulfillRequest) — the page never hits the network for it.

Options: :status (response code, default 200), :headers (a name => value map or keyword list), :body (a binary or iodata, base64-encoded for you), :timeout.

html(page, opts \\ [])

@spec html(
  t(),
  keyword()
) :: {:ok, String.t()} | {:error, term()}

Returns the page's full serialized HTML (document.documentElement.outerHTML).

observe_network(page, opts \\ [])

@spec observe_network(
  t(),
  keyword()
) :: :ok | {:error, term()}

Starts observing network traffic, delivering CDP Network events to the calling process.

Subscribes the caller to :events (default the request + response lifecycle), then enables the Network domain (idempotent). Each event arrives as {:cdp_event, conn, method, params, session_id} — handle them in a handle_info. Call stop_observing_network/2 to unsubscribe.

Start observing before navigating: requests already in flight when you call this are not captured. On a :session-transport page the caller receives every session's events on the shared connection (subscriptions are keyed by method, not session); match on the session_id element to filter to this page.

Options:

  • :events — Network.* method names (default request + response lifecycle)
  • :timeout — ms for the enable call (default 10_000)
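Since events arrive as plain messages, a short-lived caller can collect them with a receive loop instead of a handle_info. The sketch below gathers Network.responseReceived events until the mailbox has been quiet for a while. The NetLog module is hypothetical, and the "requestId", "response", "status", and "url" keys are assumed to follow the CDP event shape.

```elixir
defmodule NetLog do
  @moduledoc false

  # Collects Network.responseReceived events from the caller's mailbox until
  # none arrives for `idle_ms` milliseconds. Other messages are left in place.
  # Returns [{request_id, status, url}] in arrival order.
  def drain(idle_ms \\ 200), do: drain(idle_ms, [])

  defp drain(idle_ms, acc) do
    receive do
      {:cdp_event, _conn, "Network.responseReceived", params, _session_id} ->
        %{"requestId" => id, "response" => %{"status" => status, "url" => url}} = params
        drain(idle_ms, [{id, status, url} | acc])
    after
      idle_ms -> Enum.reverse(acc)
    end
  end
end
```

A typical flow would call observe_network/2 with events: ["Network.responseReceived"] before navigating, then NetLog.drain/1, then response_body/3 for each collected request id, and finally stop_observing_network/2 with the same :events list. A body may not be retrievable until the response has finished loading, so treat an {:error, _} from response_body/3 as possible.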

pdf(page, opts \\ [])

@spec pdf(
  t(),
  keyword()
) :: {:ok, binary()} | {:error, term()}

Renders the page to PDF (Page.printToPDF).

Returns {:ok, data} where data is the PDF bytes — or, when :path is given, the written file path (also a binary). Options: :path, :landscape (default false), :print_background (default true), :timeout (default 30_000).

response_body(page, request_id, opts \\ [])

@spec response_body(t(), String.t(), keyword()) :: {:ok, binary()} | {:error, term()}

Returns a response's body by its request_id (from a Network.responseReceived event), via Network.getResponseBody.

Returns {:ok, body} (decoding base64 when Chrome sends it that way) or {:error, reason}. The Network domain must have been enabled (e.g. via observe_network/2) when the request was captured — unlike the other Network ops this does not lazily enable it, since enabling now can't recover a past body. If it wasn't enabled, the call surfaces as {:error, {:cdp_error, "Network.getResponseBody", _}}. Options: :timeout (default 10_000).

screenshot(page, opts \\ [])

@spec screenshot(
  t(),
  keyword()
) :: {:ok, binary()} | {:error, term()}

Captures a PNG screenshot.

Returns {:ok, data} where data is the PNG bytes — or, when :path is given, the written file path (also a binary).

Options: :path (write to file), :full_page (capture beyond the viewport, default false), :timeout (default 30_000).
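Because both return shapes are binaries, code that receives {:ok, data} without knowing whether :path was set cannot tell them apart by type. One option, sketched here with a hypothetical Capture module, is to match on the standard PNG signature (or the "%PDF-" prefix for pdf/2 output).

```elixir
defmodule Capture do
  @moduledoc false

  # Classifies a binary returned in {:ok, data} by its leading bytes.
  # The first clause matches the 8-byte PNG file signature.
  def classify(<<0x89, "PNG", 0x0D, 0x0A, 0x1A, 0x0A, _rest::binary>>), do: :png_bytes
  def classify(<<"%PDF-", _rest::binary>>), do: :pdf_bytes
  # Anything else is assumed to be a file path (or unrelated data).
  def classify(data) when is_binary(data), do: :path_or_other
end
```

In most code it is simpler to know at the call site whether :path was passed; the classifier is mainly useful in generic wrappers.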

set_cookies(page, cookies, opts \\ [])

@spec set_cookies(t(), [map()], keyword()) :: :ok | {:error, term()}

Sets cookies. Each is a CDP CookieParam map — at least "name", "value", and a "url" or "domain". Lazily enables Network. Options: :timeout.
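The minimum shape described above can be checked before calling set_cookies/3. The CookieCheck module below is a hypothetical pre-flight helper, not part of CDPEx; it only verifies the string keys named in this section and does not validate the values themselves.

```elixir
defmodule CookieCheck do
  @moduledoc false

  # True when the map has string "name" and "value" entries plus either a
  # "url" or a "domain". Keys must be strings, as in the CDP CookieParam shape.
  def valid?(%{"name" => name, "value" => value} = cookie)
      when is_binary(name) and is_binary(value) do
    is_binary(cookie["url"]) or is_binary(cookie["domain"])
  end

  def valid?(_other), do: false

  # Splits a list of cookie maps into {valid, invalid}.
  def partition(cookies) when is_list(cookies), do: Enum.split_with(cookies, &valid?/1)
end
```

For example, %{"name" => "sid", "value" => "abc", "url" => "https://example.test"} passes, while the same map with atom keys or without "url"/"domain" does not.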

set_extra_headers(page, headers, opts \\ [])

@spec set_extra_headers(t(), %{optional(String.t()) => String.t()}, keyword()) ::
  :ok | {:error, term()}

Sets extra HTTP headers sent with every subsequent request on this page.

headers is a map of header name => value; set them before navigating for them to apply to that navigation. Lazily enables Network. Options: :timeout.

set_user_agent(page, user_agent, opts \\ [])

@spec set_user_agent(t(), String.t(), keyword()) :: :ok | {:error, term()}

Overrides the page's User-Agent (Emulation.setUserAgentOverride).

Options: :timeout (default 10_000).

set_viewport(page, width, height, opts \\ [])

@spec set_viewport(t(), pos_integer(), pos_integer(), keyword()) ::
  :ok | {:error, term()}

Overrides the viewport via Emulation.setDeviceMetricsOverride.

width/height are CSS pixels. Options: :device_scale_factor (default 1), :mobile (default false), :timeout. Returns :ok.
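The emulation setters compose naturally with `with`, since each returns :ok or {:error, term()}. The sketch below applies a phone-like profile before navigation. The Emulate module, the 390x844 size, the scale factor, and the Accept-Language header are illustrative assumptions, and mod is injectable only so the example runs without a browser.

```elixir
defmodule Emulate do
  @moduledoc false

  # Applies a phone-like profile; stops at the first {:error, _}.
  # `mod` is injectable for illustration; it defaults to CDPEx.Page.
  def mobile(page, user_agent, mod \\ CDPEx.Page) when is_binary(user_agent) do
    with :ok <- mod.set_viewport(page, 390, 844, device_scale_factor: 3, mobile: true),
         :ok <- mod.set_user_agent(page, user_agent, []) do
      # Set before navigating so the headers apply to that navigation.
      mod.set_extra_headers(page, %{"Accept-Language" => "en-US"}, [])
    end
  end
end
```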

stop_observing_network(page, opts \\ [])

@spec stop_observing_network(
  t(),
  keyword()
) :: :ok

Stops observing — unsubscribes the caller from the network :events. Leaves the Network domain enabled.

Pass the same :events you gave observe_network/2. Both default to the request + response lifecycle, but if you observed with a custom list you must repeat it here — otherwise the original subscriptions are never removed and the caller keeps receiving those events.

text(page, css, opts \\ [])

@spec text(t(), String.t(), keyword()) :: {:ok, String.t() | nil} | {:error, term()}

Returns the textContent of the first element matching css, or nil when no element matches.

visible?(page, css, opts \\ [])

@spec visible?(t(), String.t(), keyword()) :: {:ok, boolean()} | {:error, term()}

Returns {:ok, true} when the first element matching css is rendered and visible (has layout boxes, not display: none / visibility: hidden), {:ok, false} otherwise — including when no element matches.

wait_for_function(page, js, opts \\ [])

@spec wait_for_function(t(), String.t(), keyword()) :: :ok | {:error, term()}

Polls a JavaScript expression until it is truthy, or timeout elapses.

The expression is coerced with !!(...), so JS truthiness applies. Returns :ok, {:error, :timeout}, or {:error, reason} if a non-transient evaluate error occurs (e.g. a thrown exception or a dropped connection). Options: :timeout (default 5_000), :interval (poll interval ms, default 100).

wait_for_navigation(page, opts \\ [])

@spec wait_for_navigation(
  t(),
  keyword()
) :: :ok | {:error, term()}

Waits for a navigation lifecycle milestone, without issuing a navigation.

Useful after a click/3 (or other in-page action) that triggers navigation.

Options:

  • :wait_until — :network_almost_idle (default), :load, or :none
  • :timeout — ms (default 30_000)

Returns :ok, {:error, :timeout}, or {:error, reason} if the connection drops while waiting.
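A click-then-wait flow can be sketched by chaining wait_for_selector/3, click/3, and wait_for_navigation/2. The ClickThrough module is hypothetical and the option values are illustrative; mod is injectable only so the example can be run without a browser.

```elixir
defmodule ClickThrough do
  @moduledoc false

  # Waits for the element, clicks it, then waits for the resulting navigation.
  # Returns :ok or the first {:error, _} encountered.
  # `mod` is injectable for illustration; it defaults to CDPEx.Page.
  def follow(page, css, mod \\ CDPEx.Page) when is_binary(css) do
    with :ok <- mod.wait_for_selector(page, css, timeout: 5_000, interval: 100),
         :ok <- mod.click(page, css, []) do
      mod.wait_for_navigation(page, wait_until: :load, timeout: 30_000)
    end
  end
end
```

If the click does not actually trigger a navigation, the final step is expected to end in {:error, :timeout}, so callers may want a shorter :timeout for links that are only sometimes navigational.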

wait_for_selector(page, css, opts \\ [])

@spec wait_for_selector(t(), String.t(), keyword()) :: :ok | {:error, term()}

Polls until css matches an element, or timeout elapses.

Returns :ok, {:error, :timeout}, or {:error, reason} if a non-transient evaluate error occurs (e.g. the connection drops). Options: :timeout (default 5_000), :interval (poll interval ms, default 100).