Limiting EC2 resources used by AWS Data Pipeline during DynamoDB table backups


I need to back up 6 DynamoDB tables every couple of hours. I created 6 pipelines from templates and they ran fine, except that they spun up 6 or more EC2 virtual machines that mostly stayed up between runs. That is not an economy I can afford.

Does anyone have experience optimizing this kind of scenario?


There are 2 answers

Rohit Kulshreshtha

Some solutions that come to mind are:

One: To ensure that EC2 resources are terminated, set the terminateAfter property on the Ec2Resource definition. The semantics of terminateAfter are discussed here: How does AWS Data Pipeline run an EC2 instance?
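As a minimal sketch, an Ec2Resource definition with terminateAfter could look like the following. The id, instance type, and role names are placeholders, not values from the question:

    {
      "id": "BackupEc2Resource",
      "type": "Ec2Resource",
      "instanceType": "t1.micro",
      "terminateAfter": "1 Hour",
      "role": "DataPipelineDefaultRole",
      "resourceRole": "DataPipelineDefaultResourceRole"
    }

With this set, the instance is terminated one hour after it starts, whether or not further activities are pending, so choose a window that covers your longest backup.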

Two: This thread on the AWS forum discusses how an existing EC2 instance may be used by Data Pipeline instead of a freshly launched one.
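The usual mechanism for this is a worker group: an activity names a workerGroup instead of a runsOn resource, and a Task Runner process you start on your own long-running instance with the same worker group string picks up the work. A hedged sketch, with a hypothetical id, command, and group name:

    {
      "id": "BackupActivity",
      "type": "ShellCommandActivity",
      "command": "echo run the backup step here",
      "workerGroup": "my-backup-workers"
    }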

Three: The backup pipeline template always creates a single pipeline with a single Activity that reads from a single source and writes to a single destination. You can view the JSON source of the pipeline in the AWS console and write a similar pipeline with multiple Activity instances, one for each table you want to back up. Since the pipeline definition will then contain only one EMR resource, that one resource does the work of all the activities.
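A rough sketch of such a consolidated definition, assuming two tables and omitting the data nodes and EMR step details the template would normally fill in (all ids are placeholders):

    {
      "objects": [
        {
          "id": "SharedEmrCluster",
          "type": "EmrCluster",
          "masterInstanceType": "m1.medium",
          "coreInstanceCount": "1",
          "terminateAfter": "2 Hours"
        },
        {
          "id": "BackupTableA",
          "type": "EmrActivity",
          "runsOn": { "ref": "SharedEmrCluster" },
          "input": { "ref": "TableASourceNode" },
          "output": { "ref": "TableABackupNode" }
        },
        {
          "id": "BackupTableB",
          "type": "EmrActivity",
          "runsOn": { "ref": "SharedEmrCluster" },
          "input": { "ref": "TableBSourceNode" },
          "output": { "ref": "TableBBackupNode" }
        }
      ]
    }

Because both activities share the runsOn reference, the cluster is launched once per scheduled run instead of once per table.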

AravindR

You can set the field maxActiveInstances on the Ec2Resource object.

maxActiveInstances: The maximum number of concurrent active instances of a component. For activities, setting this to 1 runs instances in strict chronological order. A value greater than 1 allows different instances of the activity to run concurrently, and requires you to ensure your activity can tolerate concurrent execution.

See this: http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-object-ec2resource.html
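For example, a resource capped at one concurrent instance might be defined like this. The id and role names are placeholders:

    {
      "id": "BackupResource",
      "type": "Ec2Resource",
      "maxActiveInstances": "1",
      "instanceType": "t1.micro",
      "role": "DataPipelineDefaultRole",
      "resourceRole": "DataPipelineDefaultResourceRole"
    }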
