readability v0.9.0 Readability
Readability library for extracting & curating articles.
Example
@type html :: binary
# Just pass a URL
%Readability.Summary{title: title, authors: authors, article_html: article} = Readability.summarize(url)
# Extract title
Readability.title(html)
# Extract authors.
Readability.authors(html)
# Extract only text from article
article =
  html
  |> Readability.article()
  |> Readability.readable_text()
# Extract article with transformed HTML
article =
  html
  |> Readability.article()
  |> Readability.raw_html()
Summary
Functions
Using a variety of metrics (content score, class name, element types), find the content the user most likely wants to read
Extract authors
Return true if the Content-Type in the provided headers list is a markup type, otherwise false
Extract the MIME type from headers
Return the raw HTML binary from an html_tree
Return HTML with attributes and tags cleaned
Return only the text binary from an html_tree
Summarize the primary readable content of a webpage
Extract title
Types
Functions
Using a variety of metrics (content score, class name, element types), find the content the user most likely wants to read
Example
iex> article_tree = Readability.article(html_str)
# returns the article as a tuple
Return true if the Content-Type in the provided headers list is a markup type, otherwise false
Example
iex> Readability.is_response_markup?([{"Content-Type", "text/html"}])
true
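The header check described above can be sketched in plain Elixir. This is only an illustration under stated assumptions, not the library's implementation: the module name `MarkupCheckSketch`, the list of accepted MIME types, and the `"text/plain"` fallback are all hypothetical choices made for this example.

```elixir
defmodule MarkupCheckSketch do
  # Hypothetical helper, not part of Readability. The MIME types treated
  # as markup here are an assumption made for this sketch.
  @markup_mimes ["text/html", "application/xhtml+xml", "application/xml", "text/xml"]

  # Extract the MIME type from a list of {name, value} headers, dropping
  # parameters such as "; charset=utf-8". Header names are matched
  # case-insensitively. Falls back to "text/plain" (an assumption) when
  # no Content-Type header is present.
  def mime(headers) do
    headers
    |> Enum.find(fn {name, _value} -> String.downcase(name) == "content-type" end)
    |> case do
      {_name, value} ->
        value |> String.split(";") |> hd() |> String.trim() |> String.downcase()

      nil ->
        "text/plain"
    end
  end

  # True when the extracted MIME type is one of the markup types above.
  def markup?(headers), do: mime(headers) in @markup_mimes
end
```

Splitting on `;` before comparing means a header such as `text/html; charset=utf-8` is still recognized as markup, which a plain string comparison would miss.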
Return the raw HTML binary from an html_tree
Return HTML with attributes and tags cleaned
Return only the text binary from an html_tree
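To make the idea of pulling text out of a tree concrete, here is a minimal sketch. It assumes the tree is made of binaries (text nodes) and `{tag, attributes, children}` tuples, the shape commonly produced by Elixir HTML parsers. The module `TreeTextSketch` is hypothetical and may differ from what the library actually does.

```elixir
defmodule TreeTextSketch do
  # Hypothetical helper, not part of Readability. Assumes each node is
  # either a binary (text) or a {tag, attributes, children} tuple.

  # A list of nodes: collect the text of each node, join the pieces with
  # spaces, then collapse whitespace runs.
  def text(nodes) when is_list(nodes) do
    nodes
    |> Enum.map(&text/1)
    |> Enum.join(" ")
    |> normalize()
  end

  # A text node is returned as-is.
  def text(node) when is_binary(node), do: node

  # An element contributes the text of its children.
  def text({_tag, _attrs, children}), do: text(children)

  # Anything else (for example a comment node) contributes no text.
  def text(_other), do: ""

  # Collapse runs of whitespace into single spaces and trim the ends.
  defp normalize(str), do: str |> String.replace(~r/\s+/, " ") |> String.trim()
end
```

Joining with spaces keeps adjacent blocks from running together, at the cost of inserting a space where an inline tag splits a word. A production implementation would likely treat inline and block elements differently.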
Summarize the primary readable content of a webpage.