Optimum Batch and Window Size for Real-time Processing with Kinesis and Lambda


I have a system where 300+ IoT devices are sending data every 10 seconds. The average message size is less than 1 KB. I'm using Kinesis Data Streams for data ingestion and AWS Lambda for processing. I have provisioned 2 shards in Kinesis. I need to find the optimal batch size and window size to keep the system real-time.

Here are the details of my setup:

  • Number of IoT devices: 300+
  • Data generation frequency: 10 seconds
  • Average data size: < 1 KB per message
  • Data ingestion platform: Kinesis Data Streams
  • Number of Kinesis shards: 2
  • Processing platform: AWS Lambda
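As a rough sanity check on those numbers, the ingest load can be compared with the per-shard write limits documented for Kinesis Data Streams (about 1,000 records/s and 1 MB/s per shard). This is only a sketch: it assumes exactly 300 devices, a full 1 KB per message, and partition keys that spread records evenly over both shards. The function name `ingest_load` is made up for illustration.

```python
# Back-of-the-envelope ingest load for the setup listed above.
# Assumptions (not measurements): 300 devices, 1 KB per message,
# records spread evenly across the shards.

SHARD_WRITE_RECORDS_PER_S = 1000  # documented per-shard write limit (records)
SHARD_WRITE_KB_PER_S = 1000       # roughly 1 MB/s per shard, rounded down


def ingest_load(devices: int, interval_s: float, msg_kb: float, shards: int) -> dict:
    """Return the aggregate ingest rate and the fraction of shard capacity used."""
    records_per_s = devices / interval_s
    kb_per_s = records_per_s * msg_kb
    return {
        "records_per_s": records_per_s,
        "kb_per_s": kb_per_s,
        "record_utilization": records_per_s / (shards * SHARD_WRITE_RECORDS_PER_S),
        "byte_utilization": kb_per_s / (shards * SHARD_WRITE_KB_PER_S),
    }


load = ingest_load(devices=300, interval_s=10, msg_kb=1.0, shards=2)
for name, value in load.items():
    print(f"{name}: {value}")
```

Under these assumptions the stream receives about 30 records/s, a small fraction of what two shards accept, so the tuning question is mostly about latency rather than throughput.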

One formula I came across for calculating the batch size is:

batch_size = desired_latency / data_generation_frequency

Is this formula correct for calculating the optimal batch size?

Additionally, is there a similar formula for calculating the optimal window size?
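To make the question concrete, the sketch below evaluates the quoted formula next to a simple arrival-rate estimate. The estimate is an assumption added here for comparison, not an established rule: it counts how many records one shard is expected to receive during a batching window, given that Lambda invokes the function when the batch size is reached, the batching window expires, or the payload limit is hit. Both function names and the example values (5 s latency target, 1 s window) are hypothetical.

```python
import math


def formula_batch_size(desired_latency_s: float, generation_interval_s: float) -> float:
    """The formula quoted in the question, evaluated as written."""
    return desired_latency_s / generation_interval_s


def expected_records_per_window(devices: int, generation_interval_s: float,
                                window_s: float, shards: int) -> int:
    """Records one shard is expected to receive during a batching window.

    Assumes devices send at a steady rate and records are spread evenly
    across shards.
    """
    per_shard_rate = devices / generation_interval_s / shards  # records/s per shard
    return math.ceil(per_shard_rate * window_s)


# Hypothetical targets: 5 s acceptable latency, 1 s batching window.
print(formula_batch_size(desired_latency_s=5, generation_interval_s=10))
print(expected_records_per_window(devices=300, generation_interval_s=10,
                                  window_s=1, shards=2))
```

With these inputs the quoted formula gives 0.5, which is independent of the number of devices, while the arrival-rate estimate gives 15 records per shard per window.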

Questions:

  • What is the optimal batch size for processing data in Lambda to maintain real-time performance?
  • What is the optimal window size for batching data in Kinesis to ensure efficient processing?
  • Should I consider using Enhanced Fan-Out (EFO), which gives each consumer dedicated read throughput, for higher throughput?
  • What are some additional factors to consider when optimizing data batching for real-time processing?
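For reference, the batch size and batching window asked about above are settings on the Lambda event source mapping. A minimal sketch of the relevant parameters follows, assuming the mapping is created with boto3's `create_event_source_mapping`; the stream ARN, function name, and chosen values are placeholders, and the helper `build_event_source_mapping_kwargs` is hypothetical.

```python
def build_event_source_mapping_kwargs(stream_arn: str, function_name: str,
                                      batch_size: int, window_s: int,
                                      parallelization_factor: int = 1) -> dict:
    """Build keyword arguments for a Kinesis-to-Lambda event source mapping.

    Range checks follow the documented limits for Kinesis sources.
    """
    if not 1 <= batch_size <= 10_000:
        raise ValueError("BatchSize must be between 1 and 10000 for Kinesis")
    if not 0 <= window_s <= 300:
        raise ValueError("MaximumBatchingWindowInSeconds must be between 0 and 300")
    if not 1 <= parallelization_factor <= 10:
        raise ValueError("ParallelizationFactor must be between 1 and 10")
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "StartingPosition": "LATEST",
        "BatchSize": batch_size,                     # upper bound on records per invocation
        "MaximumBatchingWindowInSeconds": window_s,  # max wait before invoking
        "ParallelizationFactor": parallelization_factor,  # concurrent batches per shard
    }


# Placeholder ARN and function name; the values are illustrative, not recommendations.
kwargs = build_event_source_mapping_kwargs(
    stream_arn="arn:aws:kinesis:us-east-1:123456789012:stream/example-stream",
    function_name="example-processor",
    batch_size=15,
    window_s=1,
)
print(kwargs)
# With boto3 installed and credentials configured, this would be passed as:
#   boto3.client("lambda").create_event_source_mapping(**kwargs)
```

The helper only builds and validates the arguments, so it runs without AWS access.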

I'm open to any suggestions or best practices for optimizing my Kinesis and Lambda configuration for real-time processing.
