# Crawlie (v0.2.0-alpha1)

The simple Elixir web crawler.
## Summary

### Functions

Crawls the urls provided in `source`, using the `Crawlie.ParserLogic` provided in `parser_logic`.
## Functions

### crawl(source, parser_logic, options \\ [])

```elixir
crawl(Stream.t, module, Keyword.t) :: Experimental.Flow.t
```

Crawls the urls provided in `source`, using the `Crawlie.ParserLogic` provided in `parser_logic`.
The `options` are used to tweak the crawler's behaviour. You can use most of the HTTPoison options, as well as Crawlie-specific options.
#### Arguments

- `source` - a `Stream` or an `Enum` containing the urls to crawl
- `parser_logic` - a `Crawlie.ParserLogic` behaviour implementation
- `options` - a `Keyword` list of options
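To make the arguments concrete, here is a minimal sketch of a `Crawlie.ParserLogic` implementation. The callback names and return shapes used below (`parse/3`, `extract_uris/3`, `extract_data/3`) are assumptions for illustration only; consult the `Crawlie.ParserLogic` documentation for the exact behaviour contract.

```elixir
defmodule MyParser do
  # Hypothetical Crawlie.ParserLogic implementation. The callbacks below
  # are assumed names for illustration; check the behaviour's docs.
  @behaviour Crawlie.ParserLogic

  # Interpret the fetched body; here we just pass the raw HTML through.
  def parse(_url, body, _options), do: {:ok, body}

  # Collect follow-up urls to crawl, using a naive href regex.
  def extract_uris(_url, parsed, _options) do
    Regex.scan(~r/href="([^"]+)"/, parsed, capture: :all_but_first)
    |> List.flatten()
  end

  # Emit the data this crawl should produce for each page.
  def extract_data(url, parsed, _options) do
    [%{url: url, size: byte_size(parsed)}]
  end
end
```

With such a module, a crawl would then look roughly like `Crawlie.crawl(["https://example.com"], MyParser, max_depth: 2)`.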
#### Crawlie options

- `:http_client` - module implementing the `Crawlie.HttpClient` behaviour, used to make the requests. If not provided, defaults to `Crawlie.HttpClient.HTTPoisonClient`.
- `:mock_client_fun` - if you're using the `Crawlie.HttpClient.MockClient`, this is the `url -> {:ok, body :: String.t} | {:error, term}` function simulating making the requests.
- `:min_demand`, `:max_demand` - see the `Flow` documentation for details.
- `:max_depth` - maximum crawling "depth". Defaults to `0`.
- `:max_retries` - maximum number of times Crawlie should try to fetch any individual page before giving up. Defaults to `3`.
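As a sketch of the `:mock_client_fun` shape documented above, the anonymous function below fakes responses without touching the network; the url routing and canned bodies are made up for illustration.

```elixir
# A mock client function matching the documented shape:
#   url -> {:ok, body :: String.t} | {:error, term}
mock_fun = fn
  # Known host (hypothetical): return a canned HTML body.
  "https://example.com/" <> _rest ->
    {:ok, ~s(<html><a href="https://example.com/page2">next</a></html>)}

  # Anything else: simulate a failed request.
  _url ->
    {:error, :not_found}
end
```

Based on the options above, such a function would presumably be wired in as `Crawlie.crawl(urls, parser, http_client: Crawlie.HttpClient.MockClient, mock_client_fun: mock_fun)`, letting you test parser logic deterministically.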