Crawly v0.2.0 Crawly.Middlewares.RobotsTxt
Obey robots.txt
A robots.txt file tells crawlers which pages or files they may or may not request from a site. It is used mainly to prevent a site from being overloaded with requests.
Please NOTE: The first rule of web crawling is you do not harm the website. The second rule of web crawling is you do NOT harm the website.
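To make Crawly honor robots.txt, this middleware is added to the `:middlewares` list in the application configuration. A minimal sketch (the sibling middleware entries and their order here are illustrative, not prescribed by this page):

```elixir
# config/config.exs
import Config

config :crawly,
  middlewares: [
    # Skips requests disallowed by the target site's robots.txt
    Crawly.Middlewares.RobotsTxt,
    # Other commonly used middlewares (illustrative)
    Crawly.Middlewares.DomainFilter,
    Crawly.Middlewares.UniqueRequest
  ]
```

Because middlewares run in order, placing `RobotsTxt` early means disallowed requests are dropped before later middlewares do any work on them.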
Summary
Functions
run(request, state)
Callback implementation for Crawly.Pipeline.run/2.
Functions
run(request, state)
Callback implementation for Crawly.Pipeline.run/2.
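The `Crawly.Pipeline.run/2` callback receives a request and the pipeline state and returns both, possibly transformed; a middleware can drop a request by returning `false` in its place. A sketch of a custom middleware under that assumption (`MyMiddleware` and `allowed?/1` are hypothetical names, not part of Crawly):

```elixir
defmodule MyMiddleware do
  @behaviour Crawly.Pipeline

  @impl Crawly.Pipeline
  def run(request, state) do
    if allowed?(request.url) do
      # Pass the request on to the next middleware unchanged
      {request, state}
    else
      # Returning false drops the request from the pipeline
      {false, state}
    end
  end

  # Hypothetical filter: drop anything under /private/
  defp allowed?(url), do: not String.contains?(url, "/private/")
end
```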