link_preview_generator v0.0.3 LinkPreviewGenerator.Parsers.Html
Parser implementation based on HTML tags.
Summary
Functions
description/2: Get page description based on first encountered h1..h6 tag
images/2: Get images based on img tags
title/2: Get page title based on first encountered title tag
Functions
Specs
description(LinkPreviewGenerator.Page.t, Floki.html_tree) :: LinkPreviewGenerator.Page.t
Get page description based on first encountered h1..h6 tag.
Preference: h1 > h2 > h3 > h4 > h5 > h6
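The preference order above can be sketched without any dependencies by walking a Floki-style tree of `{tag, attributes, children}` tuples. This is only an illustration under that assumption: `HeadingPreferenceSketch` and `first_heading/1` are hypothetical names, not part of the library, and the real parser may use Floki selectors instead.

```elixir
defmodule HeadingPreferenceSketch do
  # Heading tags in order of preference (h1 wins over h2, and so on).
  @heading_tags ~w(h1 h2 h3 h4 h5 h6)

  # Returns the text of the most preferred heading found in the tree, or nil.
  def first_heading(tree) do
    Enum.find_value(@heading_tags, fn tag -> find_text(tree, tag) end)
  end

  # Depth-first search for the first node with the given tag name.
  defp find_text(nodes, tag) when is_list(nodes) do
    Enum.find_value(nodes, fn node -> find_text(node, tag) end)
  end

  defp find_text({tag, _attrs, children}, tag), do: text(children)
  defp find_text({_other, _attrs, children}, tag), do: find_text(children, tag)
  defp find_text(_text_or_comment, _tag), do: nil

  # Concatenates all text nodes below a node.
  defp text(nodes) when is_list(nodes), do: Enum.map_join(nodes, "", &text/1)
  defp text({_tag, _attrs, children}), do: text(children)
  defp text(binary) when is_binary(binary), do: binary
  defp text(_other), do: ""
end
```

Note that an h1 appearing later in the document still wins over an earlier h2, because the search iterates over tag names first and document order second.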
Config options:
:friendly_strings
- remove leading and trailing whitespace, replace remaining newline characters with spaces, and collapse runs of multiple spaces into a single space; default: true
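The three steps described for :friendly_strings can be sketched with the standard String module. `FriendlyStringSketch.normalize/1` is a hypothetical helper written for illustration; the library's own implementation may differ, for example in how it treats tabs or carriage returns.

```elixir
defmodule FriendlyStringSketch do
  # Mirrors the documented steps: trim, newlines to spaces, collapse spaces.
  def normalize(text) when is_binary(text) do
    text
    |> String.trim()                    # drop leading and trailing whitespace
    |> String.replace("\n", " ")        # turn remaining newlines into spaces
    |> String.replace(~r/ {2,}/, " ")   # collapse runs of spaces into one
  end
end
```

For instance, `FriendlyStringSketch.normalize("  Hello\n   world  \n")` returns `"Hello world"`.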
Specs
images(LinkPreviewGenerator.Page.t, Floki.html_tree) :: LinkPreviewGenerator.Page.t
Get images based on img tags.
Config options:
:force_images_absolute_url
- try to add the website URL from the LinkPreviewGenerator.Page struct to all relative URLs, then remove any remaining relative URLs from the list; default: false
:force_images_url_schema
- try to add http:// to URLs without a scheme, then remove all invalid URLs; default: false
:filter_small_images
- if set to true, filters out images with at least one dimension smaller than 100px; if set to an integer, filters out images with at least one dimension smaller than that integer; requires ImageMagick to be installed on the machine; default: false
WARNING: Using these options may reduce performance. To prevent very long processing times, images are limited to the first 50 by design.
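A rough sketch of what :force_images_absolute_url describes, using only the standard URI module. `ImageUrlSketch` and `absolutize/2` are hypothetical names, not library functions. Applying the 50-image cap before URL processing and treating a URL as absolute only when it has both a scheme and a host are assumptions made for this example.

```elixir
defmodule ImageUrlSketch do
  # Cap taken from the documented limit of 50 images.
  @max_images 50

  # Resolves relative image URLs against the page URL, then drops
  # anything that still is not an absolute URL.
  def absolutize(urls, website_url) do
    urls
    |> Enum.take(@max_images)
    |> Enum.map(&to_absolute(&1, website_url))
    |> Enum.filter(&absolute?/1)
  end

  defp to_absolute(url, website_url) do
    if absolute?(url) do
      url
    else
      # URI.merge/2 raises if website_url itself is not absolute.
      website_url |> URI.merge(url) |> URI.to_string()
    end
  end

  # Assumption: "absolute" means both scheme and host are present.
  defp absolute?(url) do
    case URI.parse(url) do
      %URI{scheme: scheme, host: host} when is_binary(scheme) and is_binary(host) -> true
      _ -> false
    end
  end
end
```

With a page URL of `"https://example.com/blog/post"`, the relative path `"/img/a.png"` resolves to `"https://example.com/img/a.png"`, while URLs that are already absolute pass through unchanged.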
Specs
title(LinkPreviewGenerator.Page.t, Floki.html_tree) :: LinkPreviewGenerator.Page.t
Get page title based on first encountered title tag.
Config options:
:friendly_strings
- remove leading and trailing whitespace, replace remaining newline characters with spaces, and collapse runs of multiple spaces into a single space; default: true
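The options listed on this page are presumably set through the application environment. The fragment below is a sketch under that assumption: the `:link_preview_generator` application key and the file location are not confirmed by this page, and the values shown are only examples.

```elixir
# config/config.exs (assumed location)
import Config

config :link_preview_generator,
  friendly_strings: true,
  force_images_absolute_url: true,
  force_images_url_schema: true,
  # true uses the documented 100px threshold; an integer sets a custom one
  filter_small_images: 200
```

Projects on Elixir versions older than 1.9 would use `use Mix.Config` in place of `import Config`.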