python - aiohttp: rate limiting parallel requests

Question

posted Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

APIs often have rate limits that users have to follow. As an example let's take 50 requests/second. Sequential requests take 0.5-1 second and thus are too slow to come close to that limit. Parallel requests with aiohttp, however, exceed the rate limit.

To poll the API as fast as allowed, one needs to rate limit parallel calls.

Examples that I found so far decorate session.get, approximately like so:

session.get = rate_limited(max_calls_per_second)(session.get)

This works well for sequential calls. Trying to implement this in parallel calls does not work as intended.

Here's some code as example:

async with aiohttp.ClientSession() as session:
    session.get = rate_limited(max_calls_per_second)(session.get)
    tasks = (asyncio.ensure_future(download_coroutine(timeout, session, url))
             for url in urls)
    process_responses_function(await asyncio.gather(*tasks))

The problem with this is that it will rate-limit the queueing of the tasks. The execution with gather will still happen more or less at the same time. Worst of both worlds ;-).

I did find a similar question, "aiohttp: set maximum number of requests per second", but neither reply answers the actual question of limiting the request rate. The blog post from Quentin Pradet likewise only rate-limits the queueing.

To wrap it up: How can one limit the number of requests per second for parallel aiohttp requests?

1 Reply

深蓝 · Answer 1 · 2021-10-17T01:04:07+0000

If I understand correctly, you want to limit the number of simultaneous requests?

asyncio provides an object named Semaphore for this. It works like an asynchronous lock that up to N coroutines can hold at the same time (N = 50 in the code below).

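import asyncio

semaphore = asyncio.Semaphore(50)  # at most 50 requests in flight at once
#...
async def limit_wrap(url):
    async with semaphore:
        # do what you want
        ...
#...
# inside a coroutine: gather takes the awaitables as separate arguments and must be awaited
results = await asyncio.gather(*[limit_wrap(url) for url in urls])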

updated

Suppose I make 50 concurrent requests and they all finish in 2 seconds. That is only 25 requests per second, so it does not come close to the limit.

To reach 50 requests per second I would have to make 100 concurrent requests that also all finish in 2 seconds. But how can you know how long the requests will take before you actually make them?
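The arithmetic in the two paragraphs above is throughput = concurrency / latency. Here is a small sketch of it, assuming every request takes the same time; the helper names `throughput` and `required_concurrency` are made up for illustration:

```python
import math

def throughput(concurrency: int, latency_s: float) -> float:
    """Finished requests per second when `concurrency` requests are always in flight."""
    return concurrency / latency_s

def required_concurrency(target_rate: float, latency_s: float) -> int:
    """Number of requests that must be in flight to reach `target_rate` per second."""
    return math.ceil(target_rate * latency_s)

print(throughput(50, 2.0))            # 25.0
print(throughput(100, 2.0))           # 50.0
print(required_concurrency(50, 2.0))  # 100
```

Since real latency varies from request to request, a fixed concurrency cap only approximates a target rate.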

Alternatively, if what matters to you is the number of requests started per second rather than the number finished per second, you can do this:

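import asyncio

async def loop_wrap(urls):
    for url in urls:
        asyncio.ensure_future(download(url))  # start the request without awaiting it
        await asyncio.sleep(1/50)             # wait 20 ms before starting the next one

loop = asyncio.new_event_loop()
loop.create_task(loop_wrap(urls))
loop.run_forever()  # runs until stopped, e.g. with Ctrl+C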

The code above starts a new task (a Future subclass) every 1/50 second, so at most 50 requests are started per second.
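If you also need the responses, the same pacing idea can be combined with gather. This is only a minimal sketch under some assumptions: `fake_download`, `rate_limited_gather`, `max_per_second` and the `started` list are hypothetical names, and `fake_download` stands in for a real aiohttp request so the example runs without a network.

```python
import asyncio
import time

async def fake_download(url: str, started: list) -> str:
    # Hypothetical stand-in for an aiohttp request:
    # record when the request starts, then simulate waiting for the response.
    started.append(time.monotonic())
    await asyncio.sleep(0.05)
    return f"response from {url}"

async def rate_limited_gather(urls, max_per_second: float, started: list):
    interval = 1 / max_per_second
    tasks = []
    for url in urls:
        # create_task schedules the request; it begins running as soon as
        # this coroutine yields at the sleep below.
        tasks.append(asyncio.create_task(fake_download(url, started)))
        # Space out the request starts so no more than max_per_second begin per second.
        await asyncio.sleep(interval)
    # Requests overlap while in flight; gather returns results in input order.
    return await asyncio.gather(*tasks)

if __name__ == "__main__":
    urls = [f"https://example.com/{i}" for i in range(10)]
    started = []
    results = asyncio.run(rate_limited_gather(urls, max_per_second=50, started=started))
    print(len(results))  # 10
```

This only paces when requests start. It does not cap how many are in flight at once, so if the API also limits concurrent connections, you could additionally wrap the request in the Semaphore shown earlier.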
