DynamoDB data load after transforming files. Any AWS service like GCP Dataflow/Apache Beam?


New to AWS. I have a requirement to create a daily batch pipeline:

  1. Read 6-10 CSV files of 1 GB+ each. (Each file is an extract of a table from a SQL database.)
  2. Transform each file with some logic and join all files to create one item per id.
  3. Load the joined data into a single DynamoDB table with upsert logic.

The current approach I have started with: we have an EC2 instance available for such tasks, so I am writing a Python script to (1) read all the CSVs, (2) convert them into a denormalised JSON structure, and (3) import it into DynamoDB using boto3.
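Simplified, the script looks roughly like the sketch below (the file paths, table name, and "id" join key are just placeholders, and it assumes pandas is available on the instance):

    # Minimal sketch of the single-script approach; paths, table name, and the
    # "id" join key are placeholders for the real extracts.
    import glob
    from decimal import Decimal

    import boto3
    import pandas as pd

    # 1. Read all extracts and join them on the shared id column.
    frames = [pd.read_csv(path) for path in glob.glob("extracts/*.csv")]
    joined = frames[0]
    for frame in frames[1:]:
        joined = joined.merge(frame, on="id", how="outer")

    # 2. Write one item per id. put_item replaces the whole item, which acts as
    #    an upsert when each daily run produces the complete item; a partial
    #    update would need update_item instead.
    table = boto3.resource("dynamodb").Table("my-joined-table")
    with table.batch_writer(overwrite_by_pkeys=["id"]) as batch:
        for record in joined.to_dict(orient="records"):
            # DynamoDB does not accept Python floats, so convert them to Decimal
            # and drop missing values.
            item = {k: (Decimal(str(v)) if isinstance(v, float) else v)
                    for k, v in record.items() if pd.notna(v)}
            batch.put_item(Item=item)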

My concern is whether my data counts as "Big Data". Is processing 10 GB of data with a single Python script OK? And if the file sizes grow 10x down the line, will I face scaling issues? I have only worked with GCP in the past, and in this scenario I would have used Dataflow to get the task done. Is there an equivalent in AWS terms? Would be great if someone can provide some thoughts. Thanks for your time.


There are 2 answers

Pablo (accepted answer)

A more appropriate equivalent of Dataflow in AWS is Kinesis Data Analytics, which supports Apache Beam's Java SDK.

You can see an example of an Apache Beam pipeline running on their service.

Apache Beam is able to write to DynamoDB.
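For illustration only, here is a rough sketch of that idea using the Beam Python SDK, with a plain boto3 ParDo doing the DynamoDB writes (note that Kinesis Data Analytics itself expects the Java SDK, which ships a DynamoDB connector). The table name, file pattern, and join key below are placeholder assumptions:

    # Rough Beam (Python SDK) sketch; the table name, file pattern, join key,
    # and CSV parsing are placeholders, not a definitive implementation.
    import apache_beam as beam
    import boto3


    class WriteToDynamoDB(beam.DoFn):
        """Writes each joined record to DynamoDB with put_item (upsert-by-replace)."""

        def setup(self):
            self.table = boto3.resource("dynamodb").Table("my-joined-table")

        def process(self, element):
            self.table.put_item(Item=element)


    def parse_csv_line(line):
        # Placeholder parsing; real code would handle each extract's schema.
        record_id, value = line.split(",", 1)
        return record_id, {"id": record_id, "value": value}


    with beam.Pipeline() as pipeline:
        (
            pipeline
            | "ReadCSVs" >> beam.io.ReadFromText("extracts/*.csv", skip_header_lines=1)
            | "Parse" >> beam.Map(parse_csv_line)
            | "GroupById" >> beam.GroupByKey()
            | "MergeRecords" >> beam.MapTuple(
                lambda record_id, records: {k: v for r in records for k, v in r.items()}
            )
            | "Upsert" >> beam.ParDo(WriteToDynamoDB())
        )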

Good luck!

Steven Ensslen

The AWS equivalent to Google Cloud Dataflow is AWS Glue. The documentation isn't clear, but Glue does write to DynamoDB.
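As a rough sketch (the S3 paths, join key, and table name here are only placeholders), a Glue PySpark job could join the extracts and write them to DynamoDB like this:

    # Rough Glue PySpark sketch; S3 paths, join key, and table name are
    # placeholder assumptions.
    from awsglue.context import GlueContext
    from awsglue.dynamicframe import DynamicFrame
    from pyspark.context import SparkContext

    glue_context = GlueContext(SparkContext.getOrCreate())
    spark = glue_context.spark_session

    # Read the CSV extracts (here from S3) and join them on the shared id.
    left = spark.read.option("header", "true").csv("s3://my-bucket/extract_a/")
    right = spark.read.option("header", "true").csv("s3://my-bucket/extract_b/")
    joined = left.join(right, on="id", how="outer")

    # Write to DynamoDB; items with the same key are overwritten, which gives
    # upsert-by-replacement semantics.
    glue_context.write_dynamic_frame.from_options(
        frame=DynamicFrame.fromDF(joined, glue_context, "joined"),
        connection_type="dynamodb",
        connection_options={
            "dynamodb.output.tableName": "my-joined-table",
            "dynamodb.throughput.write.percent": "1.0",
        },
    )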