I'm using the EMRFS consistent view feature on EMR when running some of my Hive queries.
Now I need to access and copy objects directly from S3 using s3-dist-cp, bypassing the Hive interface, which uses the EMRFS consistent view metadata stored in DynamoDB.
I looked through the official docs for s3-dist-cp and other resources but haven't found a definitive answer on whether it honors consistent view.
According to a forum thread from summer 2017, s3-dist-cp lacks support for the EMRFS consistent view feature:
- Currently, s3-dist-cp on EMR releases do not completely use EMRFS and have code that directly uses the aws-java-sdk. The reasoning for this is that this would offer performance improvements over directly using EMRFS in certain cases. We have made efforts to increase usage of EMRFS in s3-dist-cp, but it is still not there yet. So, at this moment, I would recommend trying out DistCp.
https://forums.aws.amazon.com/thread.jspa?messageID=787883
Has anything changed as of 2020?