Is there a way to stop crawling when a specific condition is true (like scrap_item_id == predefine_value)? My problem is similar to "Scrapy - how to identify already scraped urls", but I want to force my Scrapy spider to stop crawling once it discovers the last scraped item.
In the latest version of Scrapy, available on GitHub, you can raise a CloseSpider exception to manually close a spider.
The 0.14 release notes mention: "Added CloseSpider exception to manually close spiders (r2691)"
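Example as per the docs (import added; the bytes literal keeps the check valid on Python 3, where response.body is bytes):

    from scrapy.exceptions import CloseSpider

    def parse_page(self, response):
        # Stop the whole spider as soon as the error page is detected.
        if b'Bandwidth exceeded' in response.body:
            raise CloseSpider('bandwidth_exceeded')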
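Applied to the case in the question, the same exception can be raised when the item id matches a known value. The following is only a minimal sketch: LAST_SCRAPED_ID, check_item, and the CSS selector in the commented callback are hypothetical names chosen for illustration, not part of Scrapy. The fallback class exists solely so the snippet can be run without Scrapy installed.

```python
try:
    from scrapy.exceptions import CloseSpider
except ImportError:
    # Stand-in so this sketch runs without Scrapy installed; in a real
    # project the import above is the one to use.
    class CloseSpider(Exception):
        def __init__(self, reason="cancelled"):
            super().__init__(reason)
            self.reason = reason

# Hypothetical: id of the newest item stored by the previous run.
LAST_SCRAPED_ID = "1234"


def check_item(item_id, last_scraped_id=LAST_SCRAPED_ID):
    """Return item_id, or raise CloseSpider once the known item is reached."""
    if item_id == last_scraped_id:
        raise CloseSpider("reached_last_scraped_item")
    return item_id


# Inside a spider callback this might look like (selector is hypothetical):
#
# def parse(self, response):
#     for row in response.css("div.item"):
#         item_id = check_item(row.attrib["data-id"])
#         yield {"id": item_id}
```

Note that closing is graceful rather than instantaneous, so requests that are already in flight may still be processed before the spider actually stops.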
See also: http://readthedocs.org/docs/scrapy/en/latest/topics/exceptions.html?highlight=closeSpider