Is there a way to write Apache Crunch output to an S3 bucket? The Crunch Pipeline has a write method that takes a Target as a parameter. Is there a way to pass S3 as the Target to that write method?
How to write Apache Crunch output to an Amazon S3 bucket
Asked by Sam
Couldn't you just call the write method on your PCollection and pass it a Target that points at your S3 location?
This is essentially how we do it, although we are running within EMR. To migrate data from our on-prem cluster, we use the Hadoop distcp command.
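The approach described above can be sketched roughly as follows. This is a minimal illustration, not a tested setup: the class name WriteToS3, the input path, and the bucket name my-bucket are hypothetical, and it assumes the job runs somewhere the s3:// scheme resolves to a Hadoop filesystem with valid credentials, as it does on EMR.

```java
import org.apache.crunch.PCollection;
import org.apache.crunch.Pipeline;
import org.apache.crunch.PipelineResult;
import org.apache.crunch.impl.mr.MRPipeline;
import org.apache.crunch.io.To;
import org.apache.hadoop.conf.Configuration;

public class WriteToS3 {
    public static void main(String[] args) {
        // Hypothetical locations; replace with your own input path and bucket.
        String input = "hdfs:///data/input";
        String output = "s3://my-bucket/crunch-output";

        // A MapReduce-backed pipeline; the Configuration picks up the
        // cluster's filesystem settings, including any S3 credentials.
        Pipeline pipeline = new MRPipeline(WriteToS3.class, new Configuration());

        // Read each line of the input as a String.
        PCollection<String> lines = pipeline.readTextFile(input);

        // To.textFile builds a Target from a path string, so an S3 URI
        // can be passed to write like any other Hadoop path.
        pipeline.write(lines, To.textFile(output));

        // Run the planned jobs and report success or failure via exit code.
        PipelineResult result = pipeline.done();
        System.exit(result.succeeded() ? 0 : 1);
    }
}
```

Outside EMR, the same code would probably need an s3a:// URI instead, with the hadoop-aws module on the classpath and credentials configured for it. For the on-prem migration mentioned above, a command of the form hadoop distcp followed by a source and a destination URI copies data between filesystems; the exact paths depend on your clusters.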