Apache Spark 3.4.1 with Hudi 0.11.0: slow reads and writes on S3


I am using Spark, Hudi, Hadoop, Java 8, and AWS S3 in my project.

My Spark jobs were running on Spark 2.4.5 and Hadoop 2.9.1 with Apache Hudi 0.8.0, writing data to an S3 path on AWS. I recently moved all of these jobs to Spark 3.x with Hadoop 3 and tried Hudi 0.8.0, 0.11.0, and 0.14.0, but the jobs are extremely slow when reading and writing the existing data under the same S3 paths.

Are there any settings I need to change in the job? I suspect a library/JAR compatibility issue, although the job does not throw any error, exception, or warning.
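To make the question concrete, here is a minimal Java sketch of the kind of settings that might be worth checking after such an upgrade. It is not a confirmed fix and not my actual job configuration. It assumes Scala 2.12, and it assumes that the Hudi bundle has to match the Spark minor version (as far as I can tell from the Hudi release notes, a Spark 3.4 bundle only exists from Hudi 0.14.0 onward, which should be verified). The class and method names are hypothetical, and the S3A connection value is only an illustrative number.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HudiUpgradeSettings {

    // Builds the Maven coordinate of a Hudi Spark bundle.
    // Assumption: bundles follow the hudi-spark<minor>-bundle_<scala> naming pattern.
    public static String bundleArtifact(String sparkMinor, String scalaVersion, String hudiVersion) {
        return "org.apache.hudi:hudi-spark" + sparkMinor
                + "-bundle_" + scalaVersion + ":" + hudiVersion;
    }

    // Spark session settings that are candidates for review, not a verified fix.
    public static Map<String, String> sessionSettings() {
        Map<String, String> conf = new LinkedHashMap<>();
        // The Hudi quick start asks for Kryo serialization.
        conf.put("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
        // Hudi's Spark SQL extension.
        conf.put("spark.sql.extensions", "org.apache.spark.sql.hudi.HoodieSparkSessionExtension");
        // S3A connection pool size; "200" is an illustrative value, not a recommendation.
        conf.put("spark.hadoop.fs.s3a.connection.maximum", "200");
        return conf;
    }

    public static void main(String[] args) {
        // Print the pieces in spark-submit form so they can be compared with the real job.
        System.out.println("--packages " + bundleArtifact("3.4", "2.12", "0.14.0"));
        sessionSettings().forEach((k, v) -> System.out.println("--conf " + k + "=" + v));
    }
}
```

Running `main` prints a `--packages` line and one `--conf` line per setting, which can be compared against what the job is really submitted with.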

How can I fix this slowness? Any help would be greatly appreciated.
