z_html (zotonic_stdlib v1.17.0)

Utility functions for HTML processing. Also used for property filtering (by m_rsc_update).

Summary

Functions

Make all links (href/src) in the HTML absolute, using the base URL. This takes a shortcut by checking every ' (src|href)=".."' match.
Translate any HTML br elements to newlines.
Ensure that &-characters are properly escaped inside a html string.
Escape a string so that it is valid within HTML/XML.
Ensure that a string is escaped so that it is valid within HTML/XML.
Escape smaller-than and greater-than characters (for use in comments).
Escape smaller-than, greater-than, single and double quotes in texts (& is already removed or escaped).
Escape a text. Expands any urls to links with a nofollow attribute.
Escape all properties used for an update statement. Only leaves the body property intact.
Checks if all properties are properly escaped.
Flatten an attribute. The attributes have been whitelisted and the values have been sanitized.
Translate any newlines to HTML br elements.
Filter a URL: remove any "javascript:" and "data:" (as data can be text/html).
Filter a URL. If strict, also remove "data:" (as data can be text/html).
Sanitize a (X)HTML string. Remove elements and attributes that might be harmful.
Sanitize a mochiweb parse tree. Remove harmful elements and attributes.
Ensure that a URI is (quite) harmless by removing any script reference.
Given an HTML list, scrape all <link> elements and return their attributes. Attribute names are lowercased.
Strip all html elements from the text. Simple parsing is applied to find the elements. Does not escape the end result.
Strip all html elements from the text. Simple parsing is applied to find the elements. Does not escape the end result. Limit the length of the result string to N characters.
Truncate a previously sanitized HTML string.
Unescape - reverses the effect of escape.

Types

-type maybe_binary() :: undefined | binary().
-type maybe_iodata() :: undefined | iodata().
-type maybe_text() :: undefined | text().
-type maybe_unsafe_text() :: undefined | unsafe_text().
-type sanitize_option() :: {elt_extra, [binary()]} | {attr_extra, [binary()]} | {element, function()}.
-type sanitize_options() :: [sanitize_option()].
-type text() :: iodata() | {trans, [{atom(), binary()}]}.
-type unsafe_text() ::
    iodata() | {trans, [{atom(), iodata()}]} | {trans, [{binary(), iodata()}]} | {trans, map()}.

Functions

-spec abs_links(maybe_iodata(), binary()) -> iodata().
Make all links (href/src) in the HTML absolute, using the base URL. This takes a shortcut by checking every ' (src|href)=".."' match.
-spec br2nl(maybe_text()) -> maybe_text().
Translate any HTML br elements to newlines.
-spec ensure_escaped_amp(maybe_binary()) -> binary().
Ensure that &-characters are properly escaped inside a html string.
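One way to picture this: only ampersands that do not already start an entity need to be rewritten. The following is a minimal standalone sketch of that idea, not the z_html implementation. The module name `amp_sketch`, the regular expression, and the choice to map `undefined` to an empty binary are all assumptions made for illustration.

```erlang
-module(amp_sketch).
-export([ensure_escaped_amp/1]).

%% Hypothetical sketch, not part of z_html.
%% Rewrites every "&" that is not followed by a complete named or
%% numeric entity (such as "lt;" or "#39;") into "&amp;".
-spec ensure_escaped_amp(undefined | binary()) -> binary().
ensure_escaped_amp(undefined) ->
    %% Assumption: undefined is treated as empty input.
    <<>>;
ensure_escaped_amp(Bin) when is_binary(Bin) ->
    %% Negative lookahead: match "&" only when no entity body plus ";" follows.
    Pattern = <<"&(?!(?:#[0-9]+|#[xX][0-9a-fA-F]+|[a-zA-Z][a-zA-Z0-9]*);)">>,
    %% In an re replacement, "\&" inserts a literal ampersand.
    re:replace(Bin, Pattern, <<"\\&amp;">>, [global, {return, binary}]).
```

With this sketch, already-escaped entities pass through unchanged, so applying it twice gives the same result as applying it once.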
-spec escape(maybe_unsafe_text()) -> maybe_text().
Escape a string so that it is valid within HTML/XML.
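To make the escaping concrete, here is a minimal standalone sketch that replaces the five characters that are special in HTML/XML. It is not the z_html implementation: the module name `escape_sketch` and the exact entity chosen for each character are assumptions, and it handles plain binaries only, not the `undefined` or `{trans, ...}` forms allowed by the spec.

```erlang
-module(escape_sketch).
-export([escape/1]).

%% Hypothetical sketch, not part of z_html.
%% Walks the binary byte by byte; bytes of multi-byte UTF-8
%% sequences are all above 127 and are copied unchanged.
-spec escape(binary()) -> binary().
escape(Bin) when is_binary(Bin) ->
    << <<(escape_char(C))/binary>> || <<C>> <= Bin >>.

escape_char($&) -> <<"&amp;">>;
escape_char($<) -> <<"&lt;">>;
escape_char($>) -> <<"&gt;">>;
escape_char($") -> <<"&quot;">>;
escape_char($') -> <<"&#39;">>;
escape_char(C) -> <<C>>.
```

Note that this sketch escapes every ampersand, so running it on text that is already escaped would escape it a second time.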
-spec escape_check(maybe_unsafe_text()) -> maybe_text().
Ensure that a string is escaped so that it is valid within HTML/XML.
escape_html_comment(_, Acc)
Escape smaller-than and greater-than characters (for use in comments).
escape_html_text(_, Acc)
Escape smaller-than, greater-than, single and double quotes in texts (& is already removed or escaped).
-spec escape_link(maybe_iodata()) -> maybe_binary().
Escape a text. Expands any urls to links with a nofollow attribute.
-spec escape_props(list() | map()) -> list() | map().
Escape all properties used for an update statement. Only leaves the body property intact.
escape_props(Props, Options)
-spec escape_props(list() | map(), Options :: list()) -> list() | map().
escape_props_check(Props)
-spec escape_props_check(list() | map()) -> list() | map().
Checks if all properties are properly escaped.
escape_props_check(Props, Options)
-spec escape_props_check(list() | map(), Options :: list()) -> list() | map().
Flatten an attribute. The attributes have been whitelisted and the values have been sanitized.
-spec nl2br(maybe_text()) -> maybe_text().
Translate any newlines to HTML br elements.
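The newline and br conversions are inverse text substitutions. A minimal standalone sketch of both directions follows. It is not the z_html implementation: the module name `nl2br_sketch`, the `<br />` spelling of the emitted tag, and the set of br spellings recognised on the way back are assumptions, and only plain binaries are handled.

```erlang
-module(nl2br_sketch).
-export([nl2br/1, br2nl/1]).

%% Hypothetical sketch, not part of z_html.
%% Replace every line feed with a br tag.
-spec nl2br(binary()) -> binary().
nl2br(Bin) when is_binary(Bin) ->
    binary:replace(Bin, <<"\n">>, <<"<br />">>, [global]).

%% Replace <br>, <br/> and <br /> (in any letter case) with a line feed.
-spec br2nl(binary()) -> binary().
br2nl(Bin) when is_binary(Bin) ->
    re:replace(Bin, <<"<br\\s*/?>">>, <<"\n">>,
               [global, caseless, {return, binary}]).
```

The sketch leaves carriage returns in place, so text with "\r\n" line endings would need an extra normalisation step.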
-spec noscript(Url) -> SafeUrl when Url :: string() | binary(), SafeUrl :: binary().
Filter a URL: remove any "javascript:" and "data:" (as data can be text/html).
-spec noscript(Url, IsStrict) -> SafeUrl
            when Url :: string() | binary(), IsStrict :: boolean(), SafeUrl :: binary().
Filter a URL. If strict, also remove "data:" (as data can be text/html).
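The difference between the strict and non-strict modes can be sketched as a scheme check on a normalised copy of the URL. This is an illustration only, not the z_html implementation: the module name `noscript_sketch`, the normalisation step, and the choice to return an empty binary for a rejected URL are assumptions.

```erlang
-module(noscript_sketch).
-export([noscript/1, noscript/2]).

%% Hypothetical sketch, not part of z_html.
%% The one-argument form is strict, matching the description that it
%% removes both "javascript:" and "data:".
-spec noscript(binary()) -> binary().
noscript(Url) ->
    noscript(Url, true).

-spec noscript(binary(), boolean()) -> binary().
noscript(Url, IsStrict) when is_binary(Url), is_boolean(IsStrict) ->
    %% Inspect a copy with spaces and control bytes dropped and ASCII
    %% letters lowercased, so " JavaScript:" is still recognised.
    Cleaned = << <<(to_lower(C))>> || <<C>> <= Url, C > 32 >>,
    case Cleaned of
        <<"javascript:", _/binary>> -> <<>>;
        <<"data:", _/binary>> when IsStrict -> <<>>;
        _ -> Url
    end.

to_lower(C) when C >= $A, C =< $Z -> C + 32;
to_lower(C) -> C.
```

Accepted URLs are returned unchanged; only the scheme test runs on the normalised copy.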
-spec sanitize(maybe_text()) -> maybe_text().
Sanitize a (X)HTML string. Remove elements and attributes that might be harmful.
-spec sanitize(maybe_text(), sanitize_options()) -> maybe_text().
sanitize(ParseTree, ExtraElts, ExtraAttrs, Options)
-spec sanitize(z_html_parse:html_element(), binary() | list(), binary() | list(), any()) ->
            z_html_parse:html_element().
Sanitize a mochiweb parse tree. Remove harmful elements and attributes.
sanitize_attr_value(Attr, V)
-spec sanitize_uri(maybe_iodata()) -> maybe_binary().
Ensure that a URI is (quite) harmless by removing any script reference.
-spec strip(maybe_text()) -> binary().
Strip all html elements from the text. Simple parsing is applied to find the elements. Does not escape the end result.
-spec strip(maybe_text(), integer() | nolimit) -> binary().
Strip all html elements from the text. Simple parsing is applied to find the elements. Does not escape the end result. Limit the length of the result string to N characters.
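The "simple parsing" and the length limit can be pictured with a short standalone sketch. It is not the z_html implementation: the module name `strip_sketch` and the regular expression are assumptions, it handles plain binaries only, and it cuts the text at N characters without appending anything.

```erlang
-module(strip_sketch).
-export([strip/1, strip/2]).

%% Hypothetical sketch, not part of z_html.
-spec strip(binary()) -> binary().
strip(Html) ->
    strip(Html, nolimit).

-spec strip(binary(), non_neg_integer() | nolimit) -> binary().
strip(Html, nolimit) when is_binary(Html) ->
    %% Drop everything between "<" and the next ">".
    re:replace(Html, <<"<[^>]*>">>, <<>>, [global, {return, binary}]);
strip(Html, N) when is_binary(Html), is_integer(N), N >= 0 ->
    %% string:slice/3 counts characters (grapheme clusters), not bytes.
    string:slice(strip(Html, nolimit), 0, N).
```

As with the documented function, the result is not escaped, so it must be escaped again before being placed back into HTML.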
-spec truncate(maybe_text(), integer()) -> maybe_text().
Truncate a previously sanitized HTML string.
truncate(Html, Length, Append)
-spec truncate(maybe_text(), integer(), iodata()) -> maybe_text().
-spec unescape(maybe_text()) -> maybe_text().
Unescape - reverses the effect of escape.