z_url_metadata (zotonic_stdlib v1.28.1)
Discover metadata about a URL. Follows redirects and URL shorteners, then fetches the data at the final URL and inspects it for metadata tags, content headers and the first part of the HTML.
The returned opaque metadata can be queried for properties using p/2.
The Slackbot user-agent is used for fetching URLs so that the URL shorteners return a location header and other sites are coerced to give correct metadata.
Only the first MB of data is fetched, which prevents large objects from being downloaded in full.
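As a minimal sketch of the flow described above, the following fetches a URL with the default options and reads one property from the opaque result. The module name `url_metadata_example` and the function `title_of/1` are hypothetical and exist only for illustration; the return shape of the one-argument fetch is assumed to match the documented spec of fetch/2.

```erlang
-module(url_metadata_example).
-export([title_of/1]).

%% Hypothetical helper: fetch the metadata for Url and return its title.
%% Assumes fetch/1 returns {ok, Metadata} | {error, Reason}, like fetch/2.
title_of(Url) ->
    case z_url_metadata:fetch(Url) of
        {ok, MD} ->
            %% MD is opaque, so properties are read with p/2.
            %% The value is 'undefined' when no title was found.
            {ok, z_url_metadata:p(title, MD)};
        {error, _Reason} = Error ->
            Error
    end.
```

A call such as `url_metadata_example:title_of(<<"https://example.com/">>)` performs a network request, so the result depends on the remote site.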
Summary
Functions
fetch/1
Fetch metadata information for the URL with default fetch options.
fetch/2
Fetch metadata information for the URL, with URL fetch options. The data of the URL is fetched partially, with a default maximum length of 1MB. The returned metadata is extracted from the fetched data and HTTP headers.
fetch_data/2
Parse metadata from the given headers and data. If an empty header list is given, then a header with content-type html is added.
fetch_data/3
Parse metadata from the given base/final URL, headers and data. If an empty header list is given, then a header with content-type html is added.
p/2
Fetch properties of the fetched metadata.
Types
-type property() :: mime | mime_options | site_name | content_length
                  | url | canonical_url | short_url | final_url
                  | links | headers
                  | title | h1 | summary | tags | filename
                  | mtitle | description | keywords | author
                  | charset | language
                  | image | image_nav | thumbnail
                  | icon | icon_nav | icon_shortcut | icon_touch
                  | binary().
Functions
Fetch metadata information for the URL with default fetch options.
-spec fetch(binary() | string(), z_url_fetch:options()) -> {ok, metadata()} | {error, term()}.
Fetch metadata information for the URL, with URL fetch options. The data of the URL is fetched partially, with a default maximum length of 1MB. The returned metadata is extracted from the fetched data and HTTP headers.
Parse metadata from the given headers and data. If an empty header list is given, then a header with content-type html is added.
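The sketch below shows how fetch options might be passed to lower the amount of data fetched. The option names `max_length` and `timeout` are assumptions about `z_url_fetch:options()` and should be checked against the z_url_fetch documentation; the module name `url_metadata_options_example` and the function `fetch_small/1` are hypothetical.

```erlang
-module(url_metadata_options_example).
-export([fetch_small/1]).

%% Hypothetical helper: fetch metadata with a smaller download cap
%% than the default 1MB and return the final URL after redirects.
fetch_small(Url) ->
    Options = [
        {max_length, 256 * 1024},  % assumed option: maximum bytes to fetch
        {timeout, 5000}            % assumed option: timeout in milliseconds
    ],
    case z_url_metadata:fetch(Url, Options) of
        {ok, MD} ->
            {ok, z_url_metadata:p(final_url, MD)};
        {error, _Reason} = Error ->
            Error
    end.
```

A smaller cap can be enough when only tags in the HTML head are of interest, although metadata that appears later in the document might then be missed.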
This compatibility variant has no source URL, so callers that need correct normalization of relative metadata values should use fetch_data/3 and pass the final/base URL of the fetched content.
-spec fetch_data(binary() | string(), Headers, Data) -> {ok, metadata()} when Headers :: list(), Data :: binary().
Parse metadata from the given base/final URL, headers and data. If an empty header list is given, then a header with content-type html is added.
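When the content is already in memory, for instance in a test, fetch_data/3 can parse it without a network request. The following is a minimal sketch: the module name `url_metadata_fixture`, the function `title_from_html/0`, the URL and the HTML are all made up for illustration. An empty header list is passed, which according to the description above results in an html content-type header being added.

```erlang
-module(url_metadata_fixture).
-export([title_from_html/0]).

%% Hypothetical helper: parse a small in-memory HTML document.
title_from_html() ->
    %% The base/final URL of the (pretend) fetched content.
    Url = <<"https://example.com/articles/1">>,
    Html = <<"<html><head><title>Example article</title></head>",
             "<body><h1>Example article</h1></body></html>">>,
    %% Empty header list: the content is treated as HTML.
    {ok, MD} = z_url_metadata:fetch_data(Url, [], Html),
    z_url_metadata:p(title, MD).
```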
-spec p(Property, Metadata) -> Value when Property :: property() | [property()], Metadata :: metadata(), Value :: binary() | [binary()] | Headers | Links | undefined, Headers :: [{binary(), binary()}], Links :: #{binary() => [map()]}.
Fetch properties of the fetched metadata.
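Since the metadata is opaque, a caller that wants several values at once could collect them with repeated p/2 calls. This is a sketch only: the module name `url_metadata_props` and the function `to_map/1` are hypothetical, and the selection of properties is arbitrary. Each property is looked up individually with an atom from the property() type.

```erlang
-module(url_metadata_props).
-export([to_map/1]).

%% Hypothetical helper: gather a few properties into a map.
%% Properties that are absent have the value 'undefined'.
to_map(MD) ->
    Props = [title, description, image, canonical_url, mime],
    maps:from_list([{P, z_url_metadata:p(P, MD)} || P <- Props]).
```

The resulting map always has one key per requested property, which makes it straightforward to pattern match on or to pass to a template.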