
python - Scrapy Limit Requests For Testing

I've been searching the Scrapy documentation for a way to limit the number of requests my spiders are allowed to make. During development I don't want to sit around waiting for my spiders to finish an entire crawl; even though the crawls are pretty focused, they can still take quite a while.

I want the ability to say, "After x requests to the site I'm scraping, stop generating new requests."

I was wondering if there is a setting for this that I may have missed, or some other way to do it within the framework, before I try to come up with my own solution.

I was considering implementing a downloader middleware that would keep track of the number of requests being processed and stop passing them to the downloader once a limit has been reached (a rough sketch of that idea is below). But as I said, I'd rather use a mechanism that's already in the framework if possible.
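For illustration, here is a minimal sketch of that middleware idea. The REQUEST_LIMIT setting and the RequestLimitMiddleware name are invented for this example; they are not built into Scrapy:

from scrapy.exceptions import IgnoreRequest

class RequestLimitMiddleware:
    # Hypothetical downloader middleware: counts outgoing requests and
    # drops anything beyond a configurable limit.

    def __init__(self, limit):
        self.limit = limit
        self.count = 0

    @classmethod
    def from_crawler(cls, crawler):
        # REQUEST_LIMIT is an invented setting name; 0 means "no limit".
        return cls(crawler.settings.getint("REQUEST_LIMIT", 0))

    def process_request(self, request, spider):
        if self.limit and self.count >= self.limit:
            # Raising IgnoreRequest makes Scrapy drop this request.
            raise IgnoreRequest(f"request limit of {self.limit} reached")
        self.count += 1
        # Returning None lets the request continue to the downloader.
        return None

It would be enabled through DOWNLOADER_MIDDLEWARES in settings.py like any other downloader middleware.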

Any thoughts? Thank you.



1 Reply


You are looking for the CLOSESPIDER_PAGECOUNT setting of the CloseSpider extension:

An integer which specifies the maximum number of responses to crawl. If the spider crawls more than that, the spider will be closed with the reason closespider_pagecount. If zero (or not set), spiders won't be closed by number of crawled responses.
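For a test run, you could set it in settings.py:

# settings.py -- close the spider after roughly 100 responses
CLOSESPIDER_PAGECOUNT = 100

or pass it per run with the -s flag, e.g. scrapy crawl myspider -s CLOSESPIDER_PAGECOUNT=100 (the spider name "myspider" here is just a placeholder). Note that the CloseSpider extension shuts the spider down gracefully, so requests already in flight may still complete; treat the limit as approximate rather than exact.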


