Thrift server with Hudi (dbt)

I'm trying to set up Spark as the engine in dbt.

At first I tried the Thrift server on EMR 6.11, as per the docs, but it fails with a message saying it can't find the Hudi package:

20:59:12      Failed to find data source: hudi. Please find packages at
20:59:12      https://spark.apache.org/third-party-projects.html

I assumed I had to force the packages onto the server, so I tried:

sudo /usr/lib/spark/sbin/start-thriftserver.sh \
  --packages org.apache.hudi:hudi-spark3.3-bundle_2.12:0.13.1 \
  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
  --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'

But it failed again with the same message. Running the sample repo here works just fine.

I then read about Kyuubi as a Thrift server replacement, but I ran into the same error, even though I set the jars location in spark-defaults.conf, as suggested here.

This issue suggests the problem is specific to version 0.13.1, but 0.12.X shows the same behaviour.

Any ideas?
