Link Preview v1.0.1 LinkPreview.Parsers.Html

Parser implementation based on html tags.

Summary

Functions

description(page, body)

Get page description based on first encountered h1..h6 tag

images(page, body)

Get images based on img tags

title(page, body)

Get page title based on first encountered title tag

Functions

description(page, body)

Get page description based on first encountered h1..h6 tag.

Preference: h1 > h2 > h3 > h4 > h5 > h6
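To illustrate the preference order, here is a minimal sketch. It is not LinkPreview's actual implementation. It assumes the rule means: take the first h1 if one exists, otherwise the first h2, and so on down to h6. The module HeadingPreferenceSketch and the function first_heading/1 are hypothetical names, and a regex stands in for a real HTML parser.

```elixir
defmodule HeadingPreferenceSketch do
  # Hypothetical helper, not part of LinkPreview.
  # Heading tags in order of preference: h1 is checked first, h6 last.
  @tags ~w(h1 h2 h3 h4 h5 h6)

  # Returns the trimmed inner text of the most preferred heading found,
  # or nil when the HTML contains no h1..h6 tag at all.
  def first_heading(html) when is_binary(html) do
    Enum.find_value(@tags, fn tag ->
      # "i" = case-insensitive, "s" = let "." match newlines
      regex = Regex.compile!("<#{tag}\\b[^>]*>(.*?)</#{tag}>", "is")

      case Regex.run(regex, html, capture: :all_but_first) do
        [text] -> String.trim(text)
        nil -> nil
      end
    end)
  end
end

HeadingPreferenceSketch.first_heading("<h2>Sub</h2><h1>Main</h1>")
# => "Main"
```

Under this reading the h1 wins even though the h2 appears earlier in the document.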

Config options:

  • :friendly_strings

    see LinkPreview.Parsers.Basic.maybe_friendly_string/1 function

    default: true
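As a sketch of how this option might be set, assuming the options are read from the :link_preview application environment (the file location is the usual Mix convention, not something this page states):

```elixir
# config/config.exs
import Config

# Turn off friendly-string processing; the documented default is true.
config :link_preview, friendly_strings: false
```

Older Mix projects use `use Mix.Config` in place of `import Config`.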

images(page, body)

Get images based on img tags.

Config options:

  • :force_images_absolute_url

    tries to prepend the website URL from the LinkPreview.Page struct to every relative URL, then removes any URLs that are still relative

    default: false

  • :force_images_url_schema

    tries to prepend http:// to URLs that have no scheme, then removes all invalid URLs

    default: false

  • :filter_small_images

    if set to true, filters out images with at least one dimension smaller than 100px

    if set to an integer, filters out images with at least one dimension smaller than that many pixels

    requires the optional Mogrify and Tempfile packages, and ImageMagick installed on the machine

    default: false

    WARNING: Using these options may reduce performance. To prevent very long processing times, only the first 50 images are processed, by design.
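A possible configuration combining the three image options is sketched below. It assumes they are read from the :link_preview application environment. The 200 threshold is an arbitrary illustrative value.

```elixir
# config/config.exs
import Config

config :link_preview,
  # resolve relative image URLs against the page's website URL
  force_images_absolute_url: true,
  # prepend http:// to URLs that have no scheme
  force_images_url_schema: true,
  # drop images with any dimension under 200px
  # (needs the optional Mogrify and Tempfile packages plus ImageMagick)
  filter_small_images: 200
```

All three options default to false, so omitting a key leaves that step disabled.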

title(page, body)

Get page title based on first encountered title tag.

Config options:

  • :friendly_strings

    see LinkPreview.Parsers.Basic.maybe_friendly_string/1 function

    default: true