funkspector v0.1.1 Funkspector.PageScraper

Provides a function to scrape a page, given its URL.

Functions

scrape(original_url)

Fetches the given URL, follows any redirects, and returns { :ok, data }, where data holds the information scraped from the page's HTML.

Examples

iex> { :ok, data } = Funkspector.PageScraper.scrape("http://jaimeiniesta.com")
iex> data.scheme
"http"
iex> data.host
"jaimeiniesta.com"
iex> data.root_url
"http://jaimeiniesta.com/"
iex> data.links.http.internal
["http://jaimeiniesta.com/",
 "http://jaimeiniesta.com/about/",
 "http://jaimeiniesta.com/archive/",
 "http://jaimeiniesta.com/portfolio/",
 "http://jaimeiniesta.com",
 "http://jaimeiniesta.com/articles/questions-about-getting-into-freelancing/",
 "http://jaimeiniesta.com/articles/building-a-disqus-recent-comments-widget-with-javascript/",
 "http://jaimeiniesta.com/articles/tips-for-a-new-rails-developer/",
 "http://jaimeiniesta.com/articles/fifteen-servers/",
 "http://jaimeiniesta.com/atom.xml"]
iex> data.links.http.external
["http://jekyllrb.com",
 "http://mademistakes.com/so-simple/",
 "http://twitter.com/jaimeiniesta",
 "http://facebook.com/jaime.iniesta.7",
 "http://plus.google.com/+jaimeiniesta",
 "http://linkedin.com/in/jaimeiniesta",
 "http://instagram.com/jaimeiniesta",
 "http://www.flickr.com/photos/jaimeiniesta",
 "http://github.com/jaimeiniesta"]
iex> data.links.non_http
["mailto:jaimeiniesta@gmail.com"]
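The grouping shown above (internal, external, and non-HTTP links) can be illustrated with a minimal sketch built only on the standard URI module. This is not Funkspector's implementation: the LinkBuckets module and its classify/2 function are hypothetical names, the result is a flat map rather than the nested data.links structure, and a link counts as internal only when its host matches the page's host exactly.

```elixir
defmodule LinkBuckets do
  # Hypothetical helper, not part of Funkspector. It groups absolute links
  # into the same three buckets that appear in the example output above.
  def classify(links, page_host) do
    grouped = Enum.group_by(links, &bucket(URI.parse(&1), page_host))

    # Make sure every bucket is present, even when it has no links.
    Map.merge(%{internal: [], external: [], non_http: []}, grouped)
  end

  # HTTP(S) links are internal when they point at the page's own host.
  defp bucket(%URI{scheme: scheme, host: host}, page_host)
       when scheme in ["http", "https"] do
    if host == page_host, do: :internal, else: :external
  end

  # Anything else (mailto:, ftp:, tel:, ...) is treated as non-HTTP.
  defp bucket(_uri, _page_host), do: :non_http
end

links = [
  "http://jaimeiniesta.com/about/",
  "http://twitter.com/jaimeiniesta",
  "mailto:jaimeiniesta@gmail.com"
]

buckets = LinkBuckets.classify(links, "jaimeiniesta.com")
IO.inspect(buckets)
```

Each input link ends up in exactly one bucket, which mirrors how the example output lists every link under a single heading.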

iex> { :ok, data } = Funkspector.PageScraper.scrape("http://github.com")
iex> data.original_url
"http://github.com"
iex> data.final_url
"https://github.com/"
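
Because redirects are followed, comparing original_url with final_url is one way to tell whether the request was redirected. The sketch below is an assumption-laden illustration, not part of Funkspector: RedirectCheck and redirected?/1 are hypothetical names, and a plain map stands in for the scraped data so the snippet runs without network access.

```elixir
defmodule RedirectCheck do
  # Hypothetical helper, not part of Funkspector. It only assumes that the
  # scraped data exposes original_url and final_url, as in the example above.
  def redirected?(%{original_url: original_url, final_url: final_url}) do
    normalize(original_url) != normalize(final_url)
  end

  # Treat "http://example.com" and "http://example.com/" as the same URL,
  # so a trailing slash alone does not count as a redirect.
  defp normalize(url) do
    uri = URI.parse(url)
    URI.to_string(%URI{uri | path: uri.path || "/"})
  end
end

# A plain map standing in for the data returned by the scrape above.
data = %{original_url: "http://github.com", final_url: "https://github.com/"}

IO.inspect(RedirectCheck.redirected?(data))
# => true (the scheme changed from http to https)
```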