SpiderMan.Configuration (spider_man v0.4.6)
Handles settings for spiders.
Startup Spiders
config :spider_man, :spiders, [
  SpiderA,
  {SpiderB, settings = [...]},
  ...
]
All spiders defined under :spiders are started automatically when the :spider_man application starts.
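As an illustration, a filled-in version of the list above might look like the following. This is only a sketch: MyApp.QuotesSpider and MyApp.NewsSpider are hypothetical spider modules, and the option keys are taken from the Default Settings section of this page.

```elixir
# config/config.exs
import Config

# MyApp.QuotesSpider and MyApp.NewsSpider are hypothetical module names.
config :spider_man, :spiders, [
  # Started with no spider-specific settings.
  MyApp.QuotesSpider,
  # Started with settings that are passed to this spider only.
  {MyApp.NewsSpider,
   [
     downloader_options: [
       processor: [max_demand: 2]
     ]
   ]}
]
```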
Global Settings
config :spider_man, global_settings: settings = [...]
These settings apply to all spiders.
Settings for Spider on config files
config :spider_man, SpiderA, settings = [...]
These settings apply only to SpiderA.
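To make the difference between the two scopes concrete, the sketch below sets one option globally and another for a single spider. MyApp.NewsSpider is a hypothetical module, and the values are arbitrary examples rather than recommendations.

```elixir
# config/config.exs
import Config

# Global settings: seen by every spider.
config :spider_man,
  global_settings: [
    downloader_options: [
      rate_limiting: [allowed_messages: 5, interval: 1000]
    ]
  ]

# Spider settings: seen only by the hypothetical MyApp.NewsSpider.
config :spider_man, MyApp.NewsSpider,
  item_processor_options: [
    batchers: [default: [concurrency: 1, batch_size: 100, batch_timeout: 1000]]
  ]
```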
Default Settings
[
  downloader_options: [
    producer: SpiderMan.Producer.ETS,
    processor: [max_demand: 1],
    rate_limiting: [allowed_messages: 10, interval: 1000],
    pipelines: [SpiderMan.Pipeline.DuplicateFilter],
    post_pipelines: [],
    context: %{}
  ],
  spider_options: [
    producer: SpiderMan.Producer.ETS,
    processor: [max_demand: 1],
    pipelines: [],
    post_pipelines: [],
    context: %{}
  ],
  item_processor_options: [
    producer: SpiderMan.Producer.ETS,
    storage: SpiderMan.Storage.JsonLines,
    pipelines: [SpiderMan.Pipeline.DuplicateFilter],
    post_pipelines: [],
    context: %{},
    batchers: [default: [concurrency: 1, batch_size: 50, batch_timeout: 1000]]
  ]
]
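The defaults are plain nested keyword lists, so individual values can be read with the standard get_in/2. The snippet below copies a few of the default values into a local variable and looks them up. It does not call spider_man. The option names (max_demand, rate_limiting, batchers) look like Broadway options, so reading interval and batch_timeout as milliseconds is an assumption based on that resemblance.

```elixir
# A local copy of part of the default settings shown above.
defaults = [
  downloader_options: [
    processor: [max_demand: 1],
    rate_limiting: [allowed_messages: 10, interval: 1000]
  ],
  item_processor_options: [
    batchers: [default: [concurrency: 1, batch_size: 50, batch_timeout: 1000]]
  ]
]

# Keyword lists implement Access, so get_in/2 can walk the nesting.
rate = get_in(defaults, [:downloader_options, :rate_limiting])

batch_size =
  get_in(defaults, [:item_processor_options, :batchers, :default, :batch_size])

IO.inspect(rate[:allowed_messages], label: "allowed_messages")
IO.inspect(rate[:interval], label: "interval")
IO.inspect(batch_size, label: "batch_size")
```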
Settings Priority
1. Settings for Spider directly.
   1.1 settings defined in :spiders for the Spider.
   1.2 As the second argument when calling SpiderMan.start/2.
2. Returned by the callback function: SpiderModule.settings/0.
3. Settings for Spider on config files.
4. Global Settings.
5. Default Settings.
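One way to picture the priority list is as a stack of layers merged from the bottom up, so that a value set in a higher layer replaces the same key from a lower one. The sketch below is not spider_man's implementation. SettingsSketch, deep_merge/2, and resolve/1 are hypothetical names, and it assumes two things this page does not confirm: that the list runs from highest to lowest priority, and that nested keyword lists are merged key by key rather than replaced wholesale.

```elixir
defmodule SettingsSketch do
  # Hypothetical helper: merges `high` over `low`, recursing into
  # values that are keyword lists on both sides.
  def deep_merge(low, high) do
    Keyword.merge(low, high, fn _key, low_value, high_value ->
      if Keyword.keyword?(low_value) and Keyword.keyword?(high_value) do
        deep_merge(low_value, high_value)
      else
        high_value
      end
    end)
  end

  # Takes layers ordered from lowest to highest priority.
  def resolve(layers) do
    Enum.reduce(layers, [], fn layer, acc -> deep_merge(acc, layer) end)
  end
end

# One example layer per entry in the list above (values are made up).
default = [downloader_options: [rate_limiting: [allowed_messages: 10, interval: 1000]]]
global = [downloader_options: [rate_limiting: [allowed_messages: 5]]]
from_config_file = [downloader_options: [processor: [max_demand: 2]]]
from_callback = [downloader_options: [rate_limiting: [interval: 2000]]]
direct = [downloader_options: [rate_limiting: [allowed_messages: 1]]]

resolved =
  SettingsSketch.resolve([default, global, from_config_file, from_callback, direct])

IO.inspect(resolved, label: "resolved settings")
```

In this sketch the direct layer wins for allowed_messages, the callback layer wins for interval, and max_demand comes from the config-file layer because no higher layer sets it.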