I have a use case where I receive records from upstream for a particular batchID, along with some metadata about the batch upfront. For example, I am told that batchID="ABC" will have 2000 records. As records arrive in my service, I do some processing and save each one in the DB with status = "PROCESSED". Once I have received all 2000 records for a batchID, I have to create a CSV file containing all the records in that batch, send it to another service, and update their status to "SENT".
Approach 1 (naive): Run a query on a composite GSI on batchID+status and check whether the count matches on every request. This would be very expensive.
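For reference, approach 1 would look something like the sketch below (table and index names are assumptions; the DynamoDB client is passed in so the pagination logic can be exercised without AWS). Note that even with Select='COUNT', DynamoDB consumes read capacity for every item it examines, which is why doing this on every request is expensive:

```python
def count_processed(client, batch_id: str) -> int:
    """Count PROCESSED records for a batch via a hypothetical
    batchID+status GSI. Pass client=boto3.client("dynamodb") in real use.
    """
    total = 0
    kwargs = {
        "TableName": "records_table",          # assumed table name
        "IndexName": "batchID-status-index",   # assumed GSI name
        "KeyConditionExpression": "batchID = :b AND #s = :s",
        "ExpressionAttributeNames": {"#s": "status"},
        "ExpressionAttributeValues": {
            ":b": {"S": batch_id},
            ":s": {"S": "PROCESSED"},
        },
        # COUNT avoids transferring items, but still scans them.
        "Select": "COUNT",
    }
    while True:
        resp = client.query(**kwargs)
        total += resp["Count"]
        # A Query response can be paginated even in COUNT mode.
        if "LastEvaluatedKey" not in resp:
            return total
        kwargs["ExclusiveStartKey"] = resp["LastEvaluatedKey"]
```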
Approach 2: Use a DynamoDB atomic counter, where the key is the batchID and the value is a count. On every DB insert, I increment the count, check it, and raise a trigger if it matches the expected total. But in this case there could be throttling and errors (e.g. if the update fails).
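The atomic counter in approach 2 could be sketched as follows (the counter table name and attribute names are assumptions; the client is injected for testability). The ADD action both creates the attribute at zero and increments it in a single atomic operation:

```python
def increment_batch_count(client, batch_id: str, expected: int) -> bool:
    """Atomically increment the per-batch counter; return True when the
    batch is complete. Pass client=boto3.client("dynamodb") in real use.
    """
    resp = client.update_item(
        TableName="batch_counters",  # assumed counter table, hash key batchID
        Key={"batchID": {"S": batch_id}},
        # ADD initializes a missing attribute to 0, then increments atomically.
        UpdateExpression="ADD processedCount :one",
        ExpressionAttributeValues={":one": {"N": "1"}},
        ReturnValues="UPDATED_NEW",
    )
    count = int(resp["Attributes"]["processedCount"]["N"])
    return count == expected
```

As the question notes, a failed or throttled update here silently loses a count, so the caller needs retries (or the stream-driven design from the answer below, which replays records).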
Had it been SQL, I would have done:
SELECT COUNT(*) FROM records_table WHERE batchID = 'ABC';
I wanted to know if there's a hybrid approach in AWS that I can leverage for this use case.
I'd suggest using a second table for batch indexing and for tracking the number of processed records. You can use a DynamoDB stream to run a Lambda function that increments the count whenever a record reaches the desired status. The same Lambda function then checks whether the count has reached 2000 and, if so, triggers another Lambda function that does the sending. A more detailed description of the architecture follows.
- DynamoDBDataTable — the main table holding the records; its stream feeds Lambda1.
- DynamoDBBatchIndexingTable — tracks, per batchID, how many records have reached "PROCESSED".
- Lambda1 — consumes the stream, updates the count in DynamoDBBatchIndexingTable, and triggers Lambda2 when the count reaches the expected total.
- Lambda2 — builds the CSV, sends it to the downstream service, and updates the records' status to "SENT".
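Lambda1's core logic could be sketched like this. Everything here is an assumption on top of the description above: the table name, the "Lambda2" function name, and in particular that the expected batch size is stored on each stream record as expectedCount. The boto3 clients are injected as parameters so the logic can be tested without AWS (the real handler would pass boto3.client("dynamodb") and boto3.client("lambda")):

```python
import json


def process_stream_event(event, ddb, lam):
    """Handle a DynamoDB Streams event (sketch). For each record whose
    new image has status PROCESSED, atomically bump the batch counter;
    when it reaches the expected total, invoke Lambda2 asynchronously.
    Returns the batchIDs for which Lambda2 was triggered.
    """
    triggered = []
    for record in event.get("Records", []):
        image = record.get("dynamodb", {}).get("NewImage", {})
        if image.get("status", {}).get("S") != "PROCESSED":
            continue  # only count records that just became PROCESSED
        batch_id = image["batchID"]["S"]
        # Assumption: the expected batch size travels with each record.
        expected = int(image["expectedCount"]["N"])
        resp = ddb.update_item(
            TableName="DynamoDBBatchIndexingTable",
            Key={"batchID": {"S": batch_id}},
            UpdateExpression="ADD processedCount :one",
            ExpressionAttributeValues={":one": {"N": "1"}},
            ReturnValues="UPDATED_NEW",
        )
        count = int(resp["Attributes"]["processedCount"]["N"])
        if count == expected:
            # Fire-and-forget: Lambda2 builds the CSV and marks rows SENT.
            lam.invoke(
                FunctionName="Lambda2",
                InvocationType="Event",
                Payload=json.dumps({"batchID": batch_id}).encode(),
            )
            triggered.append(batch_id)
    return triggered
```

One caveat: stream records can be delivered more than once, so the exact "== expected" check can misfire under retries; a conditional update or an idempotency marker per record would harden this.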