Kreuzcrawl.ContentConfig (kreuzcrawl v0.3.0-rc.37)

Content extraction and conversion configuration.

Controls how HTML is converted to the output format. Uses html-to-markdown-rs as the conversion engine for all formats (markdown, plain text, djot).
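
As a minimal sketch of how this struct might be built, the snippet below fills in every field listed in the `t()` type. It assumes kreuzcrawl is already a dependency of the project. The values are illustrative only: they are not the library's defaults, the meaning given to each field in the comments is inferred from its name and type, and the `"markdown"` spelling for `output_format` is an assumption based on the format names mentioned above.

```elixir
# Illustrative values only; none of these are confirmed defaults.
config = %Kreuzcrawl.ContentConfig{
  exclude_selectors: ["nav", ".sidebar"],  # list of strings
  include_document_structure: false,       # boolean
  max_depth: nil,                          # non-negative integer or nil
  output_format: "markdown",               # string or nil (assumed spelling)
  preprocessing_preset: nil,               # string or nil
  preserve_tags: ["table"],                # list of strings
  remove_forms: true,                      # boolean
  remove_navigation: true,                 # boolean
  skip_images: false,                      # boolean
  strip_tags: ["script", "style"],         # list of strings
  wrap: true,                              # boolean
  wrap_width: 80                           # non-negative integer
}

# Standard Elixir struct-update syntax derives a variant without mutating `config`.
narrow = %{config | wrap_width: 72}
```

Because the update syntax returns a new struct, `config` keeps `wrap_width: 80` while `narrow` differs only in that one field.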

Types

t()

@type t() :: %Kreuzcrawl.ContentConfig{
  exclude_selectors: [String.t()],
  include_document_structure: boolean(),
  max_depth: non_neg_integer() | nil,
  output_format: String.t() | nil,
  preprocessing_preset: String.t() | nil,
  preserve_tags: [String.t()],
  remove_forms: boolean(),
  remove_navigation: boolean(),
  skip_images: boolean(),
  strip_tags: [String.t()],
  wrap: boolean(),
  wrap_width: non_neg_integer()
}
