I use Python's boto.kinesis module to write records to AWS Kinesis. The maximum throughput I reach is about 40 puts per second. However, according to the Kinesis FAQ:
Each shard can support up to 1000 PUT records per second.
So my current approach reaches only 4% of what is theoretically possible, which seems terribly low.
Does anyone have an idea of how the throughput could be improved?
Setup: The Kinesis stream has one shard. The producer runs on a dedicated AWS EC2 instance (t3.medium) in the same region as the stream. It creates strings of about 20 characters in length and sends them to the stream via boto.kinesis.Connection.put_record("my_stream", my_message, "partition_key").
Simplified code:
from boto import kinesis
import time

REGION = "us-east-1"   # example region
NUM_MESSAGES = 10000   # example message count

connection = kinesis.connect_to_region(REGION)
stream = connection.create_stream("my_stream", shard_count=1)
time.sleep(60)  # wait a minute until the stream is created

for i in range(NUM_MESSAGES):
    my_message = "This is message %d" % i
    # put_record takes (stream_name, data, partition_key)
    connection.put_record("my_stream", my_message, "partition_key")
The relevant limits are documented here: http://docs.aws.amazon.com/kinesis/latest/dev/service-sizes-and-limits.html
The limit is in records per second; to improve write throughput you should use PutRecords, which lets you place multiple records inside a single call. Keep appending records to a batch and, at the end, send them all with one put_records request.
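A minimal sketch of that batching approach, assuming boto's put_records method (available in recent boto versions) and reusing REGION, NUM_MESSAGES, the stream name, and the message format from the question; the 500-records cap is the documented PutRecords per-request limit:

from boto import kinesis

connection = kinesis.connect_to_region(REGION)

batch = []
for i in range(NUM_MESSAGES):
    batch.append({
        "Data": "This is message %d" % i,
        "PartitionKey": "partition_key",
    })
    # PutRecords accepts at most 500 records per request,
    # so flush the batch whenever it fills up.
    if len(batch) == 500:
        connection.put_records(batch, "my_stream")
        batch = []

# flush any leftover records
if batch:
    connection.put_records(batch, "my_stream")

Note that PutRecords can fail partially: the response includes a FailedRecordCount, so production code should check it and retry the failed entries.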
Also take a look at: https://github.com/awslabs/kinesis-poster-worker
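Batching aside, the ~40 puts/sec you observe is latency-bound: each sequential put_record is a blocking HTTP round trip of roughly 25 ms (1/40 s). The poster-worker example above parallelizes producers; here is a minimal sketch in the same spirit (the thread count and the one-connection-per-thread choice are my assumptions, since boto connections are generally not considered thread-safe):

from boto import kinesis
import threading

NUM_THREADS = 8
MESSAGES_PER_THREAD = NUM_MESSAGES // NUM_THREADS

def producer(thread_id):
    # one connection per thread
    conn = kinesis.connect_to_region(REGION)
    for i in range(MESSAGES_PER_THREAD):
        msg = "This is message %d from thread %d" % (i, thread_id)
        conn.put_record("my_stream", msg, "partition_key")

threads = [threading.Thread(target=producer, args=(t,)) for t in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

Each thread overlaps its request latency with the others, so aggregate throughput scales roughly with the thread count until the shard limit is reached.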