I successfully Copied Data From a Source Table to a Partitioned Table Using Hive but The Data Can't be found In The Partitioned Table

Question

I successfully Copied Data From a Source Table to a Partitioned Table Using Hive but The Data Can't be found In The Partitioned Table

91 views Asked by Aye At 14 January 2022 at 23:26

First of all, I load data from a local file to an unpartitioned table using hive

Creating raw data file:

stephen@stephen-VirtualBox:~/Workspace$ cat> static_source_demo.txt
11,test,2300,admin,c1 
12,test2,2220,IT,c2 
21,test3,2342,admin,c1 
34,test5,2422,admin,c2 
35,test6,2411,admin1,c1

Creating Unpartitioned Table

hive> CREATE TABLE employee_source_demo ( eid int, name string, 
>     salary string, destination string,city string) 
>     ROW FORMAT DELIMITED 
>     FIELDS TERMINATED BY ',';
OK
Time taken: 0.153 seconds
hive>

I then load the data from the file to the source table

Loading Data to Source Table:

hive> load data local inpath '/home/stephen/Workspace/static_source_demo.txt' into table employee_source_demo; 
Loading data to table zipcodes.employee_source_demo
OK
Time taken: 0.773 seconds

I Confirmed if the data is in the table

 hive> SELECT * FROM employee_source_demo;
OK
11  test    2300    admin   c1 
12  test2   2220    IT  c2 
21  test3   2342    admin   c1 
34  test5   2422    admin   c2 
35  test6   2411    admin1  c1 
Time taken: 0.228 seconds, Fetched: 5 row(s)
hive>

Now, I create the partitioned table in the same database

hive> CREATE TABLE  employee_part1 ( eid int, name String, 
    > salary String, destination String) PARTITIONED by (city string) 
    > ROW FORMAT DELIMITED 
    > FIELDS TERMINATED BY ','; 
OK
Time taken: 0.151 seconds
hive>

After that, I tried to insert data into the new table, taking partition into account.

hive> INSERT INTO TABLE employee_part1  PARTITION (city='c1') SELECT eid, name, salary, 
destination FROM employee_source_demo WHERE city='c1';

I think every thing went well. The following are the messages I got during the building/executing process

WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = stephen_20220114230858_14789266-c13d-4e53-b411-474ac5bcbde7
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1642123763239_0009, Tracking URL = http://stephen-VirtualBox:8088/proxy/application_1642123763239_0009/
Kill Command = /home/stephen/opt/hadoop-2.7.3/bin/hadoop job  -kill job_1642123763239_0009
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2022-01-14 23:09:11,139 Stage-1 map = 0%,  reduce = 0%
2022-01-14 23:09:22,019 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.23 sec
MapReduce Total cumulative CPU time: 2 seconds 230 msec
Ended Job = job_1642123763239_0009
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://localhost:9000/user/hive/warehouse/zipcodes.db/employee_part1/city=c1/.hive-staging_hive_2022-01-14_23-08-58_343_2569185367107439586-1/-ext-10000
Loading data to table zipcodes.employee_part1 partition (city=c1)
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1   Cumulative CPU: 2.23 sec   HDFS Read: 5027 HDFS Write: 57 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 230 msec
OK
Time taken: 25.388 seconds

I don't think there was error. So, I checked the partitioned table as shown below. Nothing in there.

hive> SELECT * FROM employee_part1;
OK
Time taken: 0.34 seconds
hive>

I also checked the hive warehouse. It's like the file is there but no data

hive> !hadoop fs -ls /user/hive/warehouse/zipcodes.db/employee_part1/city=c1;
Found 1 items
-rwxrwxr-x   1 stephen supergroup          0 2022-01-14 23:09 /user/hive/warehouse/zipcodes.db/employee_part1/city=c1/000000_0
hive> !hadoop fs -cat /user/hive/warehouse/zipcodes.db/employee_part1/city=c1/000000_0;
hive>

I would appreciate any solution to this. I don't know what I have done wrong.

Original Q&A

There are 1 answers

**yahoo** · Answer 1 · 2022-01-16T12:10:29+00:00

yahoo On 16 January 2022 at 12:10

Try below statement

'MSCK repair table employee_part1 '

TechQA.

I successfully Copied Data From a Source Table to a Partitioned Table Using Hive but The Data Can't be found In The Partitioned Table

There are 1 answers

Related Questions in APACHE-SPARK

Related Questions in HIVE

Related Questions in HDFS

Related Questions in HIVEQL

Related Questions in HADOOP2

Popular Questions

Trending Questions