In Databricks, I need to use a company package to interact with data. The package reads data as follows:

self.dataframe = (
    spark.read.format(self.format)
    .options(**self.options.to_dict())
    .load(self.data_path)
)

Therefore, I am looking for a way to read a Hive table through this format(...).load(...) syntax.

I tried the following calls:

spark.read.format("hive").load("hive_metastore.default.my_table")
spark.read.format("hive").load("default.my_table")
spark.read.format("hive").load("my_table")
spark.read.format("hive").load("/user/hive/warehouse/my_table")

but every attempt raises: AnalysisException: Hive data source can only be used with tables, you can not read files of Hive data source directly.

So format("hive") seems to be accepted (although I cannot find any reference to it online), but I cannot work out what argument load() expects in that case.

(Note that I am aware of the spark.read.table call, which works fine.)
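
For comparison, here is the call that does work, wrapped in a small helper. This is only a minimal sketch: the name `read_hive_table` is hypothetical and not part of the company package, and it relies on nothing beyond `DataFrameReader.table`, which takes a table name instead of a path.

```python
def read_hive_table(spark, table_name):
    """Read a metastore table by name.

    Hypothetical helper for illustration; not part of the company package.
    """
    # DataFrameReader.table resolves the name through the catalog,
    # whereas .load() passes its argument along as a path.
    return spark.read.table(table_name)


# Usage in a Databricks notebook, where `spark` is already defined:
# df = read_hive_table(spark, "default.my_table")
```

What I am after is the equivalent of this call, expressed through format(...).options(...).load(...), so that the package can be used unchanged.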
