Cannot Create table with spark SQL : Hive support is required to CREATE Hive TABLE (AS SELECT);

I'm trying to create a table in Spark (Scala) and then insert values from two existing DataFrames, but I get this exception:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Hive support is required to CREATE Hive TABLE (AS SELECT);;
'CreateTable `stat_type_predicate_percentage`, ErrorIfExists 

Here is the code :

import java.io.File
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

case class stat_type_predicate_percentage(type1: Option[String], predicate: Option[String], outin: Option[Int], percentage: Option[Float])

object LoadFiles1 {

  def main(args: Array[String]) {
    val sc = new SparkContext("local[*]", "LoadFiles1")
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    val warehouseLocation = new File("spark-warehouse").getAbsolutePath
    val spark = SparkSession
      .builder()
      .appName("Spark Hive Example")
      .config("spark.sql.warehouse.dir", warehouseLocation)
      .enableHiveSupport()
      .getOrCreate()

    import sqlContext.implicits._
    import org.apache.spark.sql._
    import org.apache.spark.sql.Row
    import org.apache.spark.sql.types.{StructType, StructField, StringType}

    // statistics
    val create = spark.sql("CREATE TABLE stat_type_predicate_percentage (type1 String, predicate String, outin INT, percentage FLOAT) USING hive")
    val insert1 = spark.sql("INSERT INTO stat_type_predicate_percentage SELECT types.type, res.predicate, 0, 1.0*COUNT(subject)/(SELECT COUNT(subject) FROM MappingBasedProperties AS resinner WHERE res.predicate = resinner.predicate) FROM MappingBasedProperties AS res, MappingBasedTypes AS types WHERE res.subject = types.resource GROUP BY res.predicate,types.type")

    val select = spark.sql("SELECT * from stat_type_predicate_percentage")
  }
}

How should I solve it?

There are 2 answers

0
Andy Quiroz On

You have to enable Hive support in your SparkSession:

val spark = SparkSession
  .builder()
  .appName("JOB2")
  .master("local")
  .enableHiveSupport()
  .getOrCreate()
1
Ilya Brodezki On

This problem may be twofold. First, you might want to try what @Tanjin suggested in the comments, which may be enough on its own: add .config("spark.sql.catalogImplementation","hive") to your SparkSession.builder. Second, if you actually want to use an existing Hive instance with its own metadata, which you can then query from outside your job, or if you want to use tables that already exist there, you should add hive-site.xml to your configuration.
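
As a rough sketch of the first suggestion, the setting can be passed to the builder before the session is created. The object name and app name below are hypothetical, and this assumes the spark-hive module is on the classpath and that no SparkContext has been created earlier in the same JVM:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical object name, used only for illustration.
object HiveCatalogSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("HiveCatalogSketch") // hypothetical app name
      .master("local[*]")
      // The setting suggested in the comments; it has to be in place
      // before the session (and its SparkContext) is first created.
      .config("spark.sql.catalogImplementation", "hive")
      .getOrCreate()

    // List the tables visible through the session's catalog.
    spark.sql("SHOW TABLES").show()

    spark.stop()
  }
}
```

Letting the builder create the context itself, rather than constructing a SparkContext beforehand, keeps all of the configuration in one place.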

This configuration file contains some properties you probably want, such as hive.metastore.uris, which lets your context create a new table that is saved in the metastore. It also lets your job read tables from your Hive instance, because the metastore records the tables and their locations.
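
If placing hive-site.xml on the classpath is not convenient, one possible alternative is to pass the same property through the builder. This is only a sketch under assumptions: the host name metastore-host is a placeholder, 9083 is assumed to be the port your metastore listens on, and the object and app names are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical object name, used only for illustration.
object ExternalMetastoreSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("ExternalMetastoreSketch") // hypothetical app name
      .master("local[*]")
      // Placeholder URI: replace with the Thrift address of your own metastore.
      .config("hive.metastore.uris", "thrift://metastore-host:9083")
      .enableHiveSupport()
      .getOrCreate()

    // Tables registered in the external metastore should be listed here.
    spark.sql("SHOW TABLES").show()

    spark.stop()
  }
}
```

Keeping the property in hive-site.xml, as described above, has the advantage that every job picks up the same metastore without code changes.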