readability v0.8.0 Readability
Readability library for extracting & curating articles.
Example
@type html :: binary
# Just pass url
%Readability.Summary{title: title, authors: authors, article_html: article} = Readability.summarize(url)
# Extract title
Readability.title(html)
# Extract authors.
Readability.authors(html)
# Extract only text from article
article =
  html
  |> Readability.article()
  |> Readability.readable_text()
# Extract article with transformed html
article =
  html
  |> Readability.article()
  |> Readability.raw_html()
Summary
Functions
Uses a variety of metrics (content score, class name, element type) to find the content a user most likely wants to read.
Extracts the authors.
Returns the raw HTML binary from html_tree.
Returns HTML with attributes and tags cleaned.
Returns only the text binary from html_tree.
Summarizes the primary readable content of a webpage.
Extracts the title.
Types
Functions
Uses a variety of metrics (content score, class name, element type) to find the content a user most likely wants to read.
Example
iex> article_tree = Readability.article(html_str)
# returns the article as a tuple
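To make the three metrics concrete, here is a minimal, self-contained sketch of how a candidate node might be scored. It is not the library's implementation: the module name `ScoringSketch`, the hint patterns, and every weight are hypothetical, and the tree is assumed to use Floki-style `{tag, attrs, children}` tuples.

```elixir
defmodule ScoringSketch do
  # Hypothetical hint patterns and weights, chosen for illustration only.
  defp positive_hint, do: ~r/article|body|content|entry|main|post|text/i
  defp negative_hint, do: ~r/comment|footer|meta|promo|sidebar|sponsor/i

  # Element-type metric: containers score up, list- and form-like tags score down.
  def tag_score("div"), do: 5
  def tag_score(tag) when tag in ["pre", "td", "blockquote"], do: 3
  def tag_score(tag) when tag in ["form", "ul", "ol", "li", "address"], do: -3
  def tag_score(_tag), do: 0

  # Class-name metric: inspects the values of the class and id attributes.
  def class_score(attrs) do
    names =
      attrs
      |> Enum.filter(fn {key, _value} -> key in ["class", "id"] end)
      |> Enum.map_join(" ", fn {_key, value} -> value end)

    bonus = if Regex.match?(positive_hint(), names), do: 25, else: 0
    penalty = if Regex.match?(negative_hint(), names), do: 25, else: 0
    bonus - penalty
  end

  # Content metric: one base point, one per comma, up to three for text length.
  def content_score(text) do
    commas = text |> String.graphemes() |> Enum.count(&(&1 == ","))
    1 + commas + min(div(String.length(text), 100), 3)
  end

  # Collects the text below a node; non-element nodes such as comments yield "".
  def text(node) when is_binary(node), do: node
  def text({_tag, _attrs, children}), do: text(children)
  def text(nodes) when is_list(nodes), do: Enum.map_join(nodes, " ", &text/1)
  def text(_other), do: ""

  # Total score of one element node: the sum of the three metrics.
  def score({tag, attrs, children}) do
    tag_score(tag) + class_score(attrs) + content_score(text(children))
  end

  # Every element node in the tree, parents before children.
  def candidates(nodes) when is_list(nodes), do: Enum.flat_map(nodes, &candidates/1)

  def candidates({tag, _attrs, children} = node) when is_binary(tag),
    do: [node | candidates(children)]

  def candidates(_other), do: []

  # Highest-scoring element, or nil when the tree has no elements.
  def best_candidate(tree) do
    tree |> candidates() |> Enum.max_by(&score/1, fn -> nil end)
  end
end

tree = [
  {"div", [{"class", "sidebar"}], ["Links, links, links"]},
  {"div", [{"id", "content"}],
   [{"p", [], ["A long paragraph, with commas, and actual prose."]}]}
]

IO.inspect(ScoringSketch.best_candidate(tree))
```

In this sketch the `sidebar` node loses on its class name even though its comma count matches the other node's, which is the kind of trade-off a combined score is meant to settle.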
Returns the raw HTML binary from html_tree.
Returns HTML with attributes and tags cleaned.
Returns only the text binary from html_tree.
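The difference between an HTML rendering and a text-only rendering of a tree can be sketched in plain Elixir. This is an illustration, not the library's code: `TreeRenderSketch` and its functions are hypothetical, the tree is assumed to use Floki-style `{tag, attrs, children}` tuples, and no HTML escaping or void-element handling is attempted.

```elixir
defmodule TreeRenderSketch do
  # Text-only rendering: keep text nodes, drop tags and attributes.
  def to_text(node) when is_binary(node), do: node
  def to_text({_tag, _attrs, children}), do: to_text(children)
  def to_text(nodes) when is_list(nodes), do: Enum.map_join(nodes, &to_text/1)
  def to_text(_other), do: ""

  # HTML rendering: re-emit each element with its attributes and children.
  def to_html(node) when is_binary(node), do: node

  def to_html({tag, attrs, children}) do
    attr_string = Enum.map_join(attrs, fn {key, value} -> ~s( #{key}="#{value}") end)
    "<#{tag}#{attr_string}>#{to_html(children)}</#{tag}>"
  end

  def to_html(nodes) when is_list(nodes), do: Enum.map_join(nodes, &to_html/1)
  def to_html(_other), do: ""
end

tree = {"p", [{"class", "lead"}], ["Hello ", {"b", [], ["world"]}]}

IO.puts(TreeRenderSketch.to_html(tree))
IO.puts(TreeRenderSketch.to_text(tree))
```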
Summarizes the primary readable content of a webpage.