Crawly v0.2.0 Crawly.Spider behaviour
A behaviour module for implementing a Crawly Spider.
A Spider is a module which is responsible for defining:
init/0 function, which must return a keyword list with a start_urls list
base_url/0 function, which returns the base URL used to filter out requests not related to the given website
parse_item/1 function, which is responsible for parsing the downloaded response and converting it into items which can be stored and new requests which can be scheduled
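The three callbacks listed above can be sketched as a minimal spider. This is an illustrative skeleton, not code from the Crawly project: the module name ExampleSpider, the example.com URLs, and the shape of the item map are placeholders, and it assumes that crawly 0.2.0 and httpoison are dependencies of the project and that Crawly.ParsedItem is a struct with items and requests fields.

```elixir
defmodule ExampleSpider do
  # Hypothetical spider; the module name and URLs are placeholders.
  @behaviour Crawly.Spider

  # Base URL of the site this spider is meant to crawl.
  @impl Crawly.Spider
  def base_url(), do: "https://www.example.com"

  # Keyword list with the URLs the crawl starts from.
  @impl Crawly.Spider
  def init(), do: [start_urls: ["https://www.example.com/blog"]]

  # Turns a downloaded response into items and follow-up requests.
  @impl Crawly.Spider
  def parse_item(response) do
    # A real spider would parse response.body here (for example with an
    # HTML parser) and build new requests; this sketch only records the
    # status code and body size and schedules nothing.
    item = %{status: response.status_code, body_size: byte_size(response.body)}

    # Assumption: Crawly.ParsedItem has :items and :requests fields.
    %Crawly.ParsedItem{items: [item], requests: []}
  end
end
```

Returning an empty requests list means the crawl would stop after the start URLs; a real spider would normally add requests for the links it extracts.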
Callbacks
base_url()
base_url() :: binary()
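To make the filtering role of the base URL concrete, here is a small sketch of one way such a check could work. It is an assumption for illustration only: the BaseUrlFilterSketch module and its same_site/2 function are hypothetical and not part of Crawly, and this page does not show how Crawly itself performs the check, so the real logic may differ.

```elixir
defmodule BaseUrlFilterSketch do
  # Hypothetical helper, not part of Crawly: keeps only the URLs that
  # start with the given base URL.
  def same_site(urls, base_url) do
    Enum.filter(urls, &String.starts_with?(&1, base_url))
  end
end

# The second URL belongs to another host, so it is dropped.
BaseUrlFilterSketch.same_site(
  ["https://www.example.com/blog/1", "https://other.example.org/"],
  "https://www.example.com"
)
# returns ["https://www.example.com/blog/1"]
```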
init()
init() :: [{:start_urls, list()}]
parse_item(response)
parse_item(response :: HTTPoison.Response.t()) :: Crawly.ParsedItem.t()