Sync S3 Objects Across Buckets with Intelligent Tiering


We want to sync all data from a us-east-1 bucket to a us-west-2 bucket, and we have Intelligent-Tiering enabled. The us-east-1 bucket holds thousands of objects, many of which are large (~2-10 GB). These files are taking a VERY long time to sync, with an estimated completion time of weeks. We're running the following command:

aws s3 sync s3://bucketname-us-east-1/folder/year=2021/ s3://bucketname-us-west-2/folder/year=2021/ --storage-class GLACIER_IR --force-glacier-transfer --source-region us-east-1 --region us-west-2 >>bucketname-20230111.out

We have considered syncing at the day-partition level, but that would still run for a long time, since we have about a year and a half of data to sync (~3000 files/objects):

aws s3 sync s3://bucketname-us-east-1/folder/year=2021/month=01/day=01/ s3://bucketname-us-west-2/folder/year=2021/month=01/day=01/ --storage-class GLACIER_IR --force-glacier-transfer --source-region us-east-1 --region us-west-2 >>bucketname-20230111.out

What we'd like to understand is the best approach to syncing the data quickly. Does AWS offer any utility that would help with this? Or are we stuck running these sync commands for a while?

We have tried running the sync command at both the month and the day partition level. In both cases, the process still takes a long time.


There is 1 answer

arjunrawal (best answer)

Have you considered using S3 Replication or an S3 Batch Operations Copy? If you want the data to be copied continuously, then replication may be a good option. These tools will automatically use parallelism and can run in the background.

https://docs.aws.amazon.com/AmazonS3/latest/userguide/batch-ops-copy-object.html

https://docs.aws.amazon.com/AmazonS3/latest/userguide/replication.html

If you want to stay with sync, there are some suggestions here on how to improve performance, but with large objects it may still take a long time: https://repost.aws/knowledge-center/s3-improve-transfer-sync-command
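
One commonly suggested way to speed up sync is to split the work by prefix and run several sync processes at the same time. Below is a minimal sketch of that idea, not a tuned solution. The bucket names, flags, and year=/month= layout come from the question. The function names, the MAX_JOBS and RUN variables, the default of 4 parallel jobs, and the log file names are assumptions made for illustration. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch only: one `aws s3 sync` per month partition, several running at once.
# Bucket names and flags come from the question; MAX_JOBS=4 is an assumed
# starting point, not a recommendation.

SRC="s3://bucketname-us-east-1/folder"
DST="s3://bucketname-us-west-2/folder"
MAX_JOBS="${MAX_JOBS:-4}"

# Print the sync command for one partition prefix, e.g. "year=2021/month=01/".
sync_command() {
  printf 'aws s3 sync %s/%s %s/%s --storage-class GLACIER_IR --force-glacier-transfer --source-region us-east-1 --region us-west-2\n' \
    "$SRC" "$1" "$DST" "$1"
}

# Print the twelve month prefixes for a given year.
month_prefixes() {
  local month
  for month in 01 02 03 04 05 06 07 08 09 10 11 12; do
    printf 'year=%s/month=%s/\n' "$1" "$month"
  done
}

# With RUN=1, execute the commands, at most MAX_JOBS at a time.
# Otherwise just print them so the plan can be reviewed first.
run_year() {
  local prefix log
  for prefix in $(month_prefixes "$1"); do
    if [ "${RUN:-0}" = "1" ]; then
      # Throttle: pause while MAX_JOBS syncs are already running.
      while [ $(jobs -rp | wc -l) -ge "$MAX_JOBS" ]; do sleep 5; done
      # Each partition appends to its own log file (hypothetical naming).
      log="sync-$(printf '%s' "$prefix" | tr '/' '_').out"
      $(sync_command "$prefix") >>"$log" 2>&1 &
    else
      sync_command "$prefix"
    fi
  done
  wait  # block until every background sync has finished
}

run_year 2021
```

Saved under a hypothetical name such as parallel-sync.sh, running `bash parallel-sync.sh` prints the twelve commands for review, and `RUN=1 MAX_JOBS=8 bash parallel-sync.sh` would execute them eight at a time. Splitting by month keeps each sync process's listing small; the same pattern could be applied at the day level if a single month is still too slow.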