I have a Lambda function that accepts a parameter, i.e. a category_id, pulls some data from an API, and updates the database based on the response.
I have to execute the same Lambda function for multiple ids, one minute apart, on a daily basis.
For example, run the Lambda for category 1 at 12:00 AM, then for category 2 at 12:01 AM, and so on for 500+ categories.
What could be the best possible solution to achieve this?
This is what I am currently thinking:
- Write Lambda using AWS SAM
- Add Lambda Layer for Shared Dependencies
- Attach the Lambda to an AWS CloudWatch Events rule to run it on a schedule
- Add Environment Variable for category_id in lambda
- Update the SAM template to declare the same Lambda function again and again; the only changes will be the cron expression of the schedule and the value of the category_id environment variable
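To make the last step concrete, here is a minimal sketch of how the per-category cron expressions could be generated. The helper name `cron_for_category` is made up for illustration. It assumes the six-field CloudWatch Events cron format (minutes, hours, day of month, month, day of week, year), which is evaluated in UTC.

```python
def cron_for_category(index: int, start_hour: int = 0, start_minute: int = 0) -> str:
    """Return a CloudWatch Events cron expression for the category at `index` (0-based).

    Category 0 runs at the start time, category 1 one minute later, and so on.
    """
    total_minutes = start_hour * 60 + start_minute + index
    if not 0 <= total_minutes < 24 * 60:
        # A daily schedule only has 1440 one-minute slots.
        raise ValueError("schedule does not fit into a single day")
    hour, minute = divmod(total_minutes, 60)
    # Fields: minutes hours day-of-month month day-of-week year
    return f"cron({minute} {hour} * * ? *)"


if __name__ == "__main__":
    for i in (0, 1, 499):
        print(i, cron_for_category(i))
```

With a start time of 00:00, the 500th category (index 499) would be scheduled at 08:19.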
Problems in the Above Solution:
- Number of Lambda functions will increase in the account.
- Each Lambda will be attached to its own CloudWatch Event, so the number of events will also increase.
- There is a quota of max 300 CloudWatch Events per account (though we can ask support to increase that limit).
- It'll require the use of nested stacks because of the SAM template size limit as well as the number of resources per template, which is 200 max.
- I'll be able to create only 50 Lambda functions per nested stack, which means the number of nested stacks will also increase, because 1 Lambda = 4 resources (Lambda + Role + Rule + Event).
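The arithmetic behind the last two points, as a quick sketch. The function name is made up, the limits are the ones quoted above, and the nested stack resources in the parent template are ignored.

```python
import math


def nested_stacks_needed(function_count: int,
                         resources_per_function: int = 4,
                         max_resources_per_stack: int = 200) -> int:
    """Number of nested stacks needed when each function costs a fixed number of resources."""
    functions_per_stack = max_resources_per_stack // resources_per_function  # 200 // 4 = 50
    return math.ceil(function_count / functions_per_stack)


if __name__ == "__main__":
    print(nested_stacks_needed(500))  # 500 functions at 50 per stack
```

So 500 categories would mean 2000 resources spread over at least 10 nested stacks, and every category beyond that pushes the count further.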
Other solutions (not sure if they can be used):
- Use of Step Functions
- Trigger only the first Lambda function with a cron schedule, and have each invocation invoke the Lambda for the next category. Only one CloudWatch Event is required (for the first category), but the time difference will vary, i.e. the next Lambda will not run exactly one minute later.
- Use only one Lambda and one CloudWatch schedule event. The Lambda function holds a list of all category ids and invokes itself recursively, taking one category id at a time and removing the used category id from the list. The only problem is that the Lambda will not run exactly one minute later for the next category_id in the list.
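For the Step Functions option, one possible shape is a Map state that walks the list of ids one at a time, invoking the Lambda and then waiting before the next item. The sketch below only builds the state machine definition (Amazon States Language) as a Python dict. The state names, the `category_ids` input key, and the example ARN are assumptions for illustration, not something from my current setup. Note that the wait starts after the Lambda finishes, so the gap between starts would be the run time plus 60 seconds rather than exactly one minute.

```python
import json


def build_state_machine(lambda_arn: str, wait_seconds: int = 60) -> dict:
    """Build an Amazon States Language definition that processes ids sequentially."""
    return {
        "Comment": "Run one Lambda per category id, one at a time (illustrative sketch)",
        "StartAt": "ForEachCategory",
        "States": {
            "ForEachCategory": {
                "Type": "Map",
                # Expects execution input like {"category_ids": [1, 2, 3]}
                "ItemsPath": "$.category_ids",
                # Process items strictly one after another
                "MaxConcurrency": 1,
                "Iterator": {
                    "StartAt": "ProcessCategory",
                    "States": {
                        "ProcessCategory": {
                            "Type": "Task",
                            "Resource": "arn:aws:states:::lambda:invoke",
                            "Parameters": {
                                "FunctionName": lambda_arn,
                                # Pass the current item as category_id
                                "Payload": {"category_id.$": "$"},
                            },
                            # Discard the Lambda response to keep the state small
                            "ResultPath": None,
                            "Next": "WaitBeforeNext",
                        },
                        "WaitBeforeNext": {
                            "Type": "Wait",
                            "Seconds": wait_seconds,
                            "End": True,
                        },
                    },
                },
                "End": True,
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical ARN, replace with the real function ARN.
    arn = "arn:aws:lambda:us-east-1:123456789012:function:example"
    print(json.dumps(build_state_machine(arn), indent=2))
```

A single scheduled rule would then start one execution per day with the full list of ids as input, instead of one rule per category.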
Looking forward to hearing about the best solution.
Given that you are doing a large amount of processing, an Amazon EC2 instance might be more appropriate.
If the bandwidth requirements are low (e.g. if it is just making API calls), then a t3a.micro ($0.0094 per hour) or even a t3a.nano instance ($0.0047 per hour) can be quite cost-effective.
A script running on the instance could process a category, then sleep for 30 seconds, in a big loop. Running 500 categories at one minute each would take about 8 hours. That's under 10c each day!
The instance can then stop or self-terminate when the work is complete. See: Auto-Stop EC2 instances when they finish a task - DEV Community
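The loop described above could look roughly like this. It is a sketch, not a drop-in script: `process_category` stands in for whatever the Lambda currently does, and instead of a fixed 30-second sleep this variant sleeps until the next one-minute slot, so the start times do not drift with how long each category takes.

```python
import time
from typing import Callable, Iterable


def run_categories(category_ids: Iterable[int],
                   process: Callable[[int], None],
                   interval_seconds: float = 60.0,
                   clock: Callable[[], float] = time.monotonic,
                   sleep: Callable[[float], None] = time.sleep) -> None:
    """Process each category at a fixed interval measured from the start of the run."""
    start = clock()
    for index, category_id in enumerate(category_ids):
        # Time remaining until this category's slot (start + index * interval).
        delay = start + index * interval_seconds - clock()
        if delay > 0:
            sleep(delay)
        # If processing overran the slot, continue immediately without sleeping.
        process(category_id)


def process_category(category_id: int) -> None:
    # Placeholder: call the API and update the database here.
    print(f"processing category {category_id}")


if __name__ == "__main__":
    # Short interval so the demo finishes quickly; use 60.0 for the real run.
    run_categories([1, 2, 3], process_category, interval_seconds=1.0)
```

The `clock` and `sleep` parameters are only there so the timing can be tested without actually waiting.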