According to the AWS documentation, S3 can publish "new object created" events to the following destinations:
- Amazon SNS
- Amazon SQS
- AWS Lambda
In your case I would:
- Create an SQS queue.
- Configure the S3 bucket to publish "new object created" events to the queue.
- Reconfigure your existing Lambda to consume from the queue.
- Configure batching for the incoming SQS events (see the sketch after this list).
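A minimal sketch of that wiring with the AWS CDK in TypeScript; the names (`UploadsBucket`, `ProcessingQueue`, `Processor`), the runtime, and the batch size of 100 are assumptions for illustration, not values from your setup:

```typescript
import * as cdk from "aws-cdk-lib";
import * as s3 from "aws-cdk-lib/aws-s3";
import * as s3n from "aws-cdk-lib/aws-s3-notifications";
import * as sqs from "aws-cdk-lib/aws-sqs";
import * as lambda from "aws-cdk-lib/aws-lambda";
import { SqsEventSource } from "aws-cdk-lib/aws-lambda-event-sources";
import { Construct } from "constructs";

export class S3ToSqsToLambdaStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Queue that buffers the S3 "new object created" events.
    const queue = new sqs.Queue(this, "ProcessingQueue", {
      // Should be at least the consuming function's timeout.
      visibilityTimeout: cdk.Duration.minutes(5),
    });

    // Bucket publishes object-created events to the queue.
    const bucket = new s3.Bucket(this, "UploadsBucket");
    bucket.addEventNotification(
      s3.EventType.OBJECT_CREATED,
      new s3n.SqsDestination(queue)
    );

    // Existing processing Lambda, now consuming the queue in batches.
    const processor = new lambda.Function(this, "Processor", {
      runtime: lambda.Runtime.NODEJS_20_X,
      handler: "index.handler",
      code: lambda.Code.fromAsset("lambda"),
      timeout: cdk.Duration.minutes(5),
    });
    processor.addEventSource(
      new SqsEventSource(queue, {
        batchSize: 100,
        // A batching window is required when batchSize exceeds 10.
        maxBatchingWindow: cdk.Duration.seconds(30),
      })
    );
  }
}
```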
Currently, the maximum batch size for an SQS-to-Lambda event source mapping is 10,000 messages for a standard queue (10 for a FIFO queue). But since your Lambda needs around 2 seconds to process a single event, you should start with something much smaller, otherwise the function will time out before it gets through the batch: a batch of 1,000 events processed sequentially at 2 seconds each would need over 30 minutes, well past Lambda's 15-minute maximum timeout.
Thanks to this, uploading X objects to S3 will result in roughly X / Y Lambda invocations, where Y is the configured batch size. For 1,000 S3 objects and a batch size of 100, that is only around 10 concurrent Lambda executions.
The AWS documentation mentioned above explains how to publish S3 events to SQS, so I won't repeat the step-by-step setup here.
Execution time
However, you might still run into slow processing, because the Lambda will most likely handle the events of a batch one by one in a loop.
The workaround is to process the batch asynchronously. How to do that depends on the Lambda runtime; in Node.js it is very easy to achieve, as the sketch below shows.
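A minimal handler sketch in TypeScript, assuming the event source mapping has "Report batch item failures" enabled; `processOne` is a hypothetical stand-in for your existing ~2-second job:

```typescript
import { SQSEvent, SQSBatchResponse, SQSRecord } from "aws-lambda";

// Hypothetical stand-in for the existing ~2-second per-item job.
async function processOne(record: SQSRecord): Promise<void> {
  const notification = JSON.parse(record.body); // S3 event notification payload
  // ... fetch and process the S3 object referenced in `notification` ...
}

export const handler = async (event: SQSEvent): Promise<SQSBatchResponse> => {
  // Process the whole batch concurrently instead of one by one in a loop.
  const results = await Promise.allSettled(
    event.Records.map((record) => processOne(record))
  );

  // Report only the failed messages back to SQS, so successfully
  // processed ones are not redelivered (requires ReportBatchItemFailures
  // on the event source mapping).
  const batchItemFailures = results.flatMap((result, i) =>
    result.status === "rejected"
      ? [{ itemIdentifier: event.Records[i].messageId }]
      : []
  );
  return { batchItemFailures };
};
```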
Also, if you want to speed up the processing in other ways, reduce the maximum batch size and increase the Lambda memory configuration, so a single execution processes fewer events while having access to more CPU (Lambda allocates CPU power proportionally to memory).
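Continuing the hypothetical CDK sketch above, those two knobs would look like this (the concrete numbers are placeholders to tune against your workload):

```typescript
// Placeholders, continuing the CDK sketch above: fewer events per
// invocation, more memory (and therefore more CPU) per invocation.
const processor = new lambda.Function(this, "Processor", {
  runtime: lambda.Runtime.NODEJS_20_X,
  handler: "index.handler",
  code: lambda.Code.fromAsset("lambda"),
  memorySize: 1024, // CPU share scales with the memory setting
  timeout: cdk.Duration.minutes(2),
});
processor.addEventSource(
  new SqsEventSource(queue, {
    batchSize: 25, // smaller batches finish faster
    maxBatchingWindow: cdk.Duration.seconds(5), // required when batchSize > 10
  })
);
```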