Is the HDFS sink in Flume using an "anti-pattern" with its default config?


Looking at the HDFS sink's default parameters in Apache Flume, it seems that they will produce a huge number of very small files (files roll at 1 kB). From what I learned about GFS/HDFS, block sizes are 64 MB and files should rather be gigabytes in size to make sure everything runs efficiently.

So I'm curious whether the default parameters of Flume are just misleading or whether I missed something else here.
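
For reference, here is a rough sketch of the settings I mean, as I understand them from the Flume user guide. The agent name `a1`, the sink name `k1`, and the HDFS path are placeholders I made up, and the override values are only illustrative guesses on my part, not recommendations:

```properties
# Hypothetical agent (a1) and sink (k1); names and path are placeholders.
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /flume/events

# Defaults as I read them in the user guide:
#   hdfs.rollInterval = 30    (seconds)
#   hdfs.rollSize     = 1024  (bytes)
#   hdfs.rollCount    = 10    (events)
# A file rolls as soon as any one of these triggers fires.

# Illustrative overrides: roll on size only, at roughly one 64 MB block.
# Setting a trigger to 0 disables it.
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.rollSize = 67108864
a1.sinks.k1.hdfs.rollCount = 0
```

If I read the defaults correctly, the event-count and time triggers would produce small files just as readily as the 1 kB size trigger, so all three would need changing.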

Cheers.
