SpiderMan.start
start(spider, settings \\ [])
Specs
start(spider(), settings()) :: Supervisor.on_start_child()
Starts a spider with the given settings.
Settings
:log2file - The default value is true.
:status - The default value is :running.
:spider_module
:ets_file
:downloader_options
:spider_options
:item_processor_options
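
As a rough illustration of how these settings fit together, the sketch below passes a keyword list to start/2. MySpider is a hypothetical module name, not part of the library. The example assumes the spider_man dependency is available and that MySpider is a valid spider for it.

```elixir
# MySpider is a hypothetical spider module used only for illustration.
settings = [
  # Override the documented default (true).
  log2file: false,
  # Same value as the documented default, shown for clarity.
  status: :running,
  # Each *_options key takes its own keyword list (see the sections below).
  downloader_options: [context: %{}],
  spider_options: [context: %{}],
  item_processor_options: [context: %{}]
]

# Per the spec, the result is a Supervisor.on_start_child() value,
# for example {:ok, pid} or {:error, reason}.
result = SpiderMan.start(MySpider, settings)
```

Since settings defaults to [], SpiderMan.start(MySpider) alone should also work when the defaults are acceptable.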
Downloader options
:requester - The default value is {{SpiderMan.Requester.Finch, []}}.
:producer - The default value is SpiderMan.Producer.ETS.
:context - The default value is %{}.
:processor - The default value is [max_demand: 1].
  :stages
  :concurrency - The default value is 8.
  :min_demand
  :max_demand - The default value is 10.
  :partition_by
  :spawn_opt
  :hibernate_after
:rate_limiting - The default value is [allowed_messages: 10, interval: 1000].
  :allowed_messages - Required.
  :interval - Required.
:pipelines - The default value is [SpiderMan.Pipeline.DuplicateFilter].
:post_pipelines - The default value is [].
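
A minimal sketch of a :downloader_options value follows. It assumes that :allowed_messages and :interval are sub-options of :rate_limiting (as the documented default suggests) and that :interval is measured in milliseconds, which is an assumption borrowed from Broadway-style rate limiting rather than something this page states.

```elixir
downloader_options = [
  # Documented default: demand one request at a time per processor.
  processor: [max_demand: 1],
  # Stricter than the documented default of 10 messages per 1000.
  # Both sub-options are required whenever :rate_limiting is given.
  rate_limiting: [allowed_messages: 5, interval: 1000],
  # Documented default pipeline, listed explicitly.
  pipelines: [SpiderMan.Pipeline.DuplicateFilter],
  post_pipelines: []
]

settings = [downloader_options: downloader_options]
```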
Spider options
:producer - The default value is SpiderMan.Producer.ETS.
:context - The default value is %{}.
:processor - The default value is [max_demand: 1].
  :stages
  :concurrency - The default value is 8.
  :min_demand
  :max_demand - The default value is 10.
  :partition_by
  :spawn_opt
  :hibernate_after
:rate_limiting
  :allowed_messages - Required.
  :interval - Required.
:pipelines - The default value is [].
:post_pipelines - The default value is [].
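
Unlike the downloader, the spider options list no default for :rate_limiting or :pipelines. The sketch below shows one way they might be set, assuming :concurrency and :max_demand are sub-options of :processor. The :site key inside the context map is purely hypothetical.

```elixir
spider_options = [
  # Assumed to be nested under :processor, as the listing suggests.
  processor: [concurrency: 4, max_demand: 1],
  # No default is documented here; if given, both sub-options are required.
  rate_limiting: [allowed_messages: 20, interval: 1000],
  # :site is a hypothetical key, used only to show that :context is a map.
  context: %{site: "example"}
]

settings = [spider_options: spider_options]
```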
Batchers options
:concurrency - The default value is 1.
:batch_size - The default value is 100.
:batch_timeout - The default value is 1000.
:partition_by
:spawn_opt
:hibernate_after
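
These options appear to describe a single batcher, matching the shape of the :batchers default under ItemProcessor options ([default: [...]]). A sketch under that assumption, with :batch_timeout taken to be in milliseconds (not stated on this page):

```elixir
# One batcher named :default, following the shape of the documented
# :batchers default value.
batchers = [
  default: [
    concurrency: 2,
    # Flush after 200 items or 2000 time units, whichever comes first
    # (assumed Broadway-style batching semantics).
    batch_size: 200,
    batch_timeout: 2_000
  ]
]

item_processor_options = [batchers: batchers]
```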
ItemProcessor options
:storage - The default value is SpiderMan.Storage.JsonLines.
:batchers - The default value is [default: [concurrency: 1, batch_size: 50, batch_timeout: 1000]].
:producer - The default value is SpiderMan.Producer.ETS.
:context - The default value is %{}.
:processor - The default value is [].
  :stages
  :concurrency - The default value is 8.
  :min_demand
  :max_demand - The default value is 10.
  :partition_by
  :spawn_opt
  :hibernate_after
:rate_limiting
  :allowed_messages - Required.
  :interval - Required.
:pipelines - The default value is [SpiderMan.Pipeline.DuplicateFilter].
:post_pipelines - The default value is [].
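
To tie the item processor options back to start/2, here is a hedged sketch that spells out the documented defaults for :storage and :pipelines explicitly. MySpider is a hypothetical module name, and the snippet assumes the spider_man dependency is installed.

```elixir
item_processor_options = [
  # Documented default storage backend, listed explicitly.
  storage: SpiderMan.Storage.JsonLines,
  # Documented default pipeline.
  pipelines: [SpiderMan.Pipeline.DuplicateFilter],
  post_pipelines: []
]

# MySpider is a hypothetical spider module.
result = SpiderMan.start(MySpider, item_processor_options: item_processor_options)
```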