SpiderMan.Pipeline.DuplicateFilter (spider_man v0.5.1)

Filters out messages whose key is a duplicate.

Usage

settings = [
  *_options: [
    pipelines: [SpiderMan.Pipeline.DuplicateFilter | {SpiderMan.Pipeline.DuplicateFilter, scope}]
  ],
  ...
]

Supported by all components: downloader | spider | item_processor.

Scope

  • :common | [scope: :common]: saves the key to common_pipeline_tid.

  • :pipeline | [scope: :pipeline]: saves the key to pipeline_tid.

common_pipeline_tid is shared by all components; pipeline_tid is used by only one component.
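
Example

As a rough sketch of how the two scopes might be combined, the settings below deduplicate downloader messages against the shared table and item_processor messages against a table private to that component. The key names downloader_options and item_processor_options are an assumption, inferred from the *_options placeholder and the component list above; check them against your own settings before relying on this.

```elixir
# Hypothetical settings fragment. The *_options key names are assumed
# from the `*_options` placeholder in Usage, not confirmed by this page.
settings = [
  downloader_options: [
    # Keys are saved to common_pipeline_tid, which every component shares.
    pipelines: [{SpiderMan.Pipeline.DuplicateFilter, :common}]
  ],
  item_processor_options: [
    # Keys are saved to pipeline_tid, visible to this component only.
    pipelines: [{SpiderMan.Pipeline.DuplicateFilter, scope: :pipeline}]
  ]
]
```

Both scope spellings listed under Scope appear here: the bare atom (:common) and the keyword form (scope: :pipeline, which is the same term as [scope: :pipeline]).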