I have a table, employee_1, in Spark with the attributes id and name (with data), and another table, employee_2, with the same attributes. I want to load the data from employee_1 into employee_2 while increasing the id values by 1.
My WITH clause is shown below:
WITH EXP AS (SELECT ALIASNAME.ID+1 ID, ALIASNAME.NAME NAME FROM employee_1 ALIASNAME)
INSERT INTO TABLE employee_2 SELECT * FROM EXP;
Steps of execution:
- I have a file (with data) in an HDFS location.
- Create an RDD based on the HDFS location.
- Load the RDD into a Hive temporary table.
- Load from the temporary table into the Hive target (employee_2).
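The steps above can be sketched as follows (a Spark 1.6 Scala sketch; the HDFS path and the comma-separated file layout are assumptions, and I am registering the temporary table as employee_1 so the WITH clause can read from it):

```scala
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)
import hiveContext.implicits._

// 1. RDD from the file in HDFS (assumed layout: "id,name" per line)
val rdd = sc.textFile("hdfs:///path/to/employee_data")

// 2. RDD to a Hive temporary table
val df = rdd.map(_.split(","))
  .map(f => (f(0).toInt, f(1)))
  .toDF("id", "name")
df.registerTempTable("employee_1")

// 3. From the temporary table to the Hive target -- this is the
//    statement that succeeds but loads no data in Spark 1.6
hiveContext.sql(
  "WITH EXP AS (SELECT id + 1 AS id, name FROM employee_1) " +
  "INSERT INTO TABLE employee_2 SELECT * FROM EXP")
```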
When I run a test program from the backend, it succeeds, but the data is not loaded: employee_2 is empty.
Note:
If you run the above WITH clause in Hive, it succeeds and the data loads. But in Spark 1.6 it doesn't. Why?
The WITH clause is not the problem; it's the INSERT INTO statement that is causing trouble.
Here's a working example that uses the .insertInto() style instead of the "INSERT INTO" SQL:
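A minimal sketch, assuming Spark 1.6 with a HiveContext available as sqlContext and the table names from the question:

```scala
// Run the SELECT part of the query as a DataFrame, then use the
// DataFrameWriter API to append the rows into the target Hive table.
val exp = sqlContext.sql("SELECT id + 1 AS id, name FROM employee_1")

// .insertInto() appends by column position, so the DataFrame's
// column order must match employee_2's schema.
exp.write.mode("append").insertInto("employee_2")
```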
Another option is to use the
df.write.mode("append").saveAsTable("edf_final")
style.
Relevant SO: "INSERT INTO ..." with SparkSQL HiveContext