I'm trying to set up Spark as the engine in dbt.
At first I tried the Thrift server on EMR 6.11, as per the docs, but it fails with a message saying it can't find the Hudi data source:
20:59:12 Failed to find data source: hudi. Please find packages at
20:59:12 https://spark.apache.org/third-party-projects.html
I assumed I had to load the package explicitly when starting the server, so I tried:
sudo /usr/lib/spark/sbin/start-thriftserver.sh \
  --packages org.apache.hudi:hudi-spark3.3-bundle_2.12:0.13.1 \
  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
  --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'
But it failed again with the same message. Running the sample repo here works just fine.
I then read about Kyuubi as a Thrift server replacement, but I ran into the same error, even though I set the jars location in spark-defaults.conf, as suggested here.
This issue reports it as a problem with version 0.13.1, but 0.12.x shows the same behaviour.
Any ideas?
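For reference, a minimal sketch of the kind of spark-defaults.conf entries meant here. The jar path is an assumption (EMR normally installs the Hudi bundle under /usr/lib/hudi, so adjust it to wherever the jar actually lives), and the catalog line is the setting the Hudi docs list for Spark 3.2+, not something confirmed for this cluster:

```
# Assumed location of the EMR-provided Hudi bundle; adjust if it differs
spark.jars                        /usr/lib/hudi/hudi-spark-bundle.jar
# Same settings as passed to start-thriftserver.sh above
spark.serializer                  org.apache.spark.serializer.KryoSerializer
spark.sql.extensions              org.apache.spark.sql.hudi.HoodieSparkSessionExtension
# Listed in the Hudi docs for Spark 3.2 and later
spark.sql.catalog.spark_catalog   org.apache.spark.sql.hudi.catalog.HoodieCatalog
```

These only take effect for sessions started after the file is changed, so the Thrift server or Kyuubi engine has to be restarted for them to apply.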