Zstandard levels in Hadoop


The compression level in org.apache.hadoop.io.compress.zstd.ZStandardCompressor doesn't seem to work. I see the reset function being called in the ZStandardCompressor constructor, which in turn calls init(level, stream) to invoke the native function; I believe this is the only place where the zstd parameters are set. In my test I confirmed that this is being called, but calling it with different levels such as 1, 5, 10, 20, etc. made no difference: the output size is exactly the same.

Hadoop doesn't seem to use zstd-jni; it uses its own native code to call zstd. I am sure people are using different levels in Hadoop. Could someone point me to where I should look next?
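For anyone who wants to reproduce this kind of level comparison, here is a minimal sketch of the test harness. It is not Hadoop's ZStandardCompressor: it uses java.util.zip.Deflater from the JDK as a stand-in (levels 0 to 9) so that it runs without native libraries, and the class and method names (LevelCheck, compressedSize, sampleInput) are made up for illustration. The idea is to compress the same input at several levels and compare sizes; the input should be reasonably large and compressible, because tiny or incompressible input can come out at nearly the same size at every level.

```java
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.Deflater;

public class LevelCheck {

    // Compress the whole input at the given level and return the output size in bytes.
    // Deflater is only a stand-in here; swap in the compressor under test.
    static int compressedSize(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish();
            byte[] buffer = new byte[8192];
            int total = 0;
            while (!deflater.finished()) {
                total += deflater.deflate(buffer);
            }
            return total;
        } finally {
            deflater.end(); // release the native zlib state
        }
    }

    // Build compressible text from a fixed seed so every run sees the same input.
    static byte[] sampleInput(int approxBytes) {
        String[] words = {"hadoop", "parquet", "zstd", "level", "codec", "block", "page", "row"};
        Random random = new Random(42);
        StringBuilder sb = new StringBuilder(approxBytes + 16);
        while (sb.length() < approxBytes) {
            sb.append(words[random.nextInt(words.length)]).append(' ');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = sampleInput(1_000_000);
        for (int level : new int[] {1, 5, 9}) {
            System.out.println("level " + level + " -> " + compressedSize(data, level) + " bytes");
        }
    }
}
```

If the sizes printed for the compressor under test are identical on input like this, the level is most likely not reaching the native call, rather than the data being the cause.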


There is 1 answer

ondway

Given that people are finding this question without an answer, I am adding the solution I used. InternalParquetRecordWriter takes a compressor as an argument, so I integrated the zstd-jni library there by creating a compressor that extends BytesInputCompressor.