Hi, I have an MR2 job that takes Snappy-compressed Avro data as input, processes it, and writes the result to an output directory in Avro format. I expect the output Avro data to be Snappy-compressed as well, but it is not. The job is map-only.
I have set the following properties in my code:
conf.set("mapreduce.map.output.compress", "true");
conf.set("mapreduce.map.output.compress.codec", "org.apache.hadoop.io.compress.SnappyCodec");
But the output is still not Snappy-compressed.
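The mapreduce.map.output.compress properties only compress the intermediate map output that is shuffled to reducers, not the final job output, so they have no effect on what a map-only job writes. The following did the trick: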
FileOutputFormat.setCompressOutput(job, true);
FileOutputFormat.setOutputCompressorClass(job, org.apache.hadoop.io.compress.SnappyCodec.class);
Please note that this has to be done before setting the output path, and in the same order as shown above.
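
For anyone who would rather stay with conf.set: as far as I can tell, the two FileOutputFormat calls just set the job-output compression properties on the job's configuration, so something like the fragment below should be roughly equivalent. This is only a sketch and I have not verified it against Avro output. It assumes the new org.apache.hadoop.mapreduce API with the Hadoop 2 property names, and that `conf` is the Configuration the Job is later created from.

```java
// Sketch only: set these BEFORE creating the Job, because the Job takes
// its own copy of the Configuration when it is constructed.
// Assumption: `conf` is the org.apache.hadoop.conf.Configuration passed to Job.getInstance(conf).

// Compress the final job output (not the intermediate map output).
conf.set("mapreduce.output.fileoutputformat.compress", "true");

// Codec used for the final job output.
conf.set("mapreduce.output.fileoutputformat.compress.codec", "org.apache.hadoop.io.compress.SnappyCodec");
```

Note the property prefix: mapreduce.output.fileoutputformat.* is for the files the job writes, while mapreduce.map.output.* is for the shuffle data between map and reduce.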