link_preview_generator v0.0.2 LinkPreviewGenerator.Parsers.Html

Parser implementation based on HTML tags.

Summary

Functions

description(page, parsed_body)

Get page description based on first encountered h1..h6 tag

images(page, parsed_body)

Get images based on img tags

title(page, parsed_body)

Get page title based on first encountered title tag

Functions

description(page, parsed_body)

Specs

description(LinkPreviewGenerator.Page.t, Floki.html_tree) :: LinkPreviewGenerator.Page.t

Get page description based on first encountered h1..h6 tag.

Preference: h1 > h2 > h3 > h4 > h5 > h6

Config options:

  • :friendly_strings - removes leading and trailing whitespace, converts the remaining newline characters to spaces, and collapses runs of multiple spaces into a single space; default: true
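A minimal sketch of the two steps described above, assuming the headings have already been extracted from the parsed body as {tag, text} tuples in document order. The module and function names (DescriptionSketch, pick_heading/1, friendly_string/1) are hypothetical and not part of LinkPreviewGenerator; the library's actual implementation may differ.

```elixir
defmodule DescriptionSketch do
  # Hypothetical helper module, for illustration only.

  # Tags in order of preference: an h1 anywhere wins over any h2, and so on.
  @preference ~w(h1 h2 h3 h4 h5 h6)

  # Returns the text of the first heading of the most preferred tag present,
  # or nil when the list contains no headings at all.
  def pick_heading(headings) do
    Enum.find_value(@preference, fn tag ->
      Enum.find_value(headings, fn
        {^tag, text} -> text
        _other -> nil
      end)
    end)
  end

  # Mirrors the documented :friendly_strings behaviour: trim both ends,
  # turn remaining newlines into spaces, collapse repeated spaces.
  def friendly_string(text) when is_binary(text) do
    text
    |> String.trim()
    |> String.replace(~r/\r?\n/, " ")
    |> String.replace(~r/ {2,}/, " ")
  end
end
```

With this sketch, a page containing an h3 followed by two h2 tags yields the text of the first h2, because h2 ranks above h3 regardless of position in the document.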

images(page, parsed_body)

Specs

Get images based on img tags.

Config options:

  • :force_images_absolute_url - tries to prepend the website URL from the LinkPreviewGenerator.Page struct to every relative URL, then removes any URLs that are still relative; default: false
  • :force_images_url_schema - tries to prepend http:// to URLs that have no scheme, then removes all invalid URLs; default: false
  • :filter_small_images - if set to true, removes images with at least one dimension smaller than 100px; if set to an integer, removes images with at least one dimension smaller than that many pixels; default: false
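The following sketch illustrates how two of these options could behave. It assumes image dimensions are already known and represents each image as a map with :url, :width and :height keys. The module ImagesSketch, its functions, and the use of URI.merge/2 for building absolute URLs are all assumptions made for illustration, not the library's actual code.

```elixir
defmodule ImagesSketch do
  # Hypothetical helper module, for illustration only.

  # Resolve a relative image URL against the page's website URL.
  # URLs that already carry a scheme are returned unchanged.
  def absolute_url(website_url, image_url) do
    case URI.parse(image_url) do
      %URI{scheme: nil} -> website_url |> URI.merge(image_url) |> URI.to_string()
      _absolute -> image_url
    end
  end

  # Translate the :filter_small_images option into a pixel threshold.
  def min_dimension(false), do: nil
  def min_dimension(true), do: 100
  def min_dimension(pixels) when is_integer(pixels), do: pixels

  # Drop every image with at least one dimension below the threshold.
  def reject_small(images, option) do
    case min_dimension(option) do
      nil -> images
      min -> Enum.reject(images, fn %{width: w, height: h} -> w < min or h < min end)
    end
  end
end
```

For example, with the option set to true an 80x600 image is dropped because its width is under 100px, while setting the option to 50 keeps it.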

title(page, parsed_body)

Specs

Get page title based on first encountered title tag.

Config options:

  • :friendly_strings - removes leading and trailing whitespace, converts the remaining newline characters to spaces, and collapses runs of multiple spaces into a single space; default: true
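As a rough illustration, the config options listed on this page might be set in a project's configuration file as shown below. The application name :link_preview_generator is an assumption inferred from the package name; the option keys and default values are the ones documented above.

```elixir
# config/config.exs
import Config

# Assumption: options are read from the :link_preview_generator application
# environment. The values shown are the documented defaults.
config :link_preview_generator,
  friendly_strings: true,
  force_images_absolute_url: false,
  force_images_url_schema: false,
  filter_small_images: false
```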