AvroParquetWriter.<GenericRecord>builder(filePath)
    .withSchema(schema)
    .withCompressionCodec(CompressionCodecName.SNAPPY)
    .withConf(conf)  // conf is a Hadoop Configuration instance
    .withDataModel(GenericData.get())
    .withWriteMode(Mode.OVERWRITE)
    .withRowGroupSize(8*1024*1024)
    .withPageSize(64*1024*1024)
    .build()
For the path I am using this logic: path = "hdfsLocation" + String.format(tid % number of rows per file, counter / number of rows per file) + "_parquet". With this I get files of about 382 KB, but I need files of around 100 MB. Please share a solution.
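To make the gap between 382 KB and 100 MB concrete, here is a rough sizing sketch. It assumes the compressed file size grows roughly linearly with the number of rows, which is only an approximation for Snappy-compressed columnar data. The class name FileSizing, both method names, and the figure of 10,000 rows per file are hypothetical; only the 382 KB and 100 MB values come from the numbers above.

```java
final class FileSizing {

    // Estimate how many rows one file needs to reach targetFileBytes,
    // scaling linearly from an observed (rows, bytes) pair. Rounds up.
    static long rowsPerFileForTarget(long currentRowsPerFile,
                                     long observedFileBytes,
                                     long targetFileBytes) {
        if (currentRowsPerFile <= 0 || observedFileBytes <= 0 || targetFileBytes <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        // multiplyExact throws instead of silently overflowing
        long scaled = Math.multiplyExact(currentRowsPerFile, targetFileBytes);
        return (scaled + observedFileBytes - 1) / observedFileBytes;
    }

    // Index of the output file that the record with this 0-based counter goes to.
    static long fileIndex(long counter, long rowsPerFile) {
        if (counter < 0 || rowsPerFile <= 0) {
            throw new IllegalArgumentException("counter must be >= 0 and rowsPerFile > 0");
        }
        return counter / rowsPerFile;
    }

    public static void main(String[] args) {
        long currentRows = 10_000L;             // hypothetical: rows per file today
        long observed = 382L * 1024;            // 382 KB, as reported above
        long target = 100L * 1024 * 1024;       // 100 MB target

        long rows = rowsPerFileForTarget(currentRows, observed, target);
        System.out.println("rows per file for ~100 MB: " + rows);
        System.out.println("file index of record " + rows + ": " + fileIndex(rows, rows));
    }
}
```

The target is about 268 times the current file size, so under this assumption each file needs about 268 times as many rows. If your Parquet version exposes ParquetWriter.getDataSize(), rolling to a new file once that value passes the target may be more reliable than a fixed row count.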