API Reference kreuzcrawl v0.3.0-rc.37

Modules

High-level API for kreuzcrawl

Result from a single page action execution.

Article metadata extracted from article:* Open Graph tags.

The category of a downloaded asset.

Authentication configuration.

Result from a single URL in a batch crawl operation.

Aggregate result of a batch crawl, exposing per-URL results plus precomputed counts.

Request to begin a multi-URL streaming crawl.

Result from a single URL in a batch scrape operation.

Aggregate result of a batch scrape, exposing per-URL results plus precomputed counts.

Browser backend used for JavaScript rendering.

Browser fallback configuration.

Browser-specific extras populated when the native browser backend was used.

When to use the headless browser fallback.

Wait strategy for browser page rendering.

A single numbered reference in a citation list — produced by the citation extractor when content uses inline [N]-style markers.

Result of citation conversion.

Content extraction and conversion configuration.

Information about an HTTP cookie received from a response.

Configuration for crawl, scrape, and map operations.

Opaque handle to a configured crawl engine.

An event emitted during a streaming crawl operation.

The result of crawling a single page during a crawl operation.

The result of a multi-page crawl operation.

Request to begin a single-URL streaming crawl.

A downloaded asset from a page.

A downloaded non-HTML document (PDF, DOCX, image, code file, etc.).

Metadata about an LLM extraction pass.

Information about a favicon or icon link.

Information about a feed link found on a page.

The type of a feed (RSS, Atom, or JSON Feed).

A heading element extracted from the page.

An hreflang alternate link entry.

Information about an image found on a page.

The source of an image reference.

Result of executing a sequence of page interaction actions.

A JSON-LD structured data entry found on a page.

Information about a link found on a page.

The classification of a link.

The result of a map operation, containing discovered URLs.

Rich markdown conversion result from HTML processing.

A single page interaction action.

Metadata extracted from an HTML page's <meta> tags and <title> element.

Proxy configuration for HTTP requests.

Response metadata extracted from HTTP headers.

The result of a single-page scrape operation.

Direction for a scroll action.

A URL entry from a sitemap.
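
The citation entry above mentions inline [N]-style markers. As a rough illustration of what such markers look like in text, here is a minimal Python sketch that collects them with a regular expression. The names `MARKER` and `citation_numbers` are hypothetical and are not part of kreuzcrawl; the library's own citation extractor may work quite differently.

```python
import re

# Hypothetical helper for illustration only; not kreuzcrawl's API.
# Matches inline markers such as [1] or [27].
MARKER = re.compile(r"\[(\d+)\]")


def citation_numbers(text: str) -> list[int]:
    """Return the distinct citation numbers in order of first appearance."""
    seen: list[int] = []
    for match in MARKER.finditer(text):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen


if __name__ == "__main__":
    sample = "Crawlers respect robots.txt [2] and sitemaps [1], as noted in [2]."
    print(citation_numbers(sample))
```

Each collected number would then correspond to one numbered reference in a citation list; how kreuzcrawl actually pairs markers with references is not stated in this index.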