Need to merge multiple hive partitions into one partition in spark


I have around 50 partitions in a Hive table, and I need to merge each set of partitions into a single partition. I tried the RENAME PARTITION command, but it fails with an error.

How can I merge multiple Hive partitions into one partition in Spark? These are the statements I ran and the error I got:

 ALTER TABLE db.table PARTITION (appname='SCORING',indicator='segment_id:1|process_date:20220417') RENAME TO PARTITION (appname='SCORING',indicator='process_date:20220417')

 ALTER TABLE db.table PARTITION (appname='SCORING',indicator='segment_id:3|process_date:20220417') RENAME TO PARTITION (appname='SCORING',indicator='process_date:20220417')

 ALTER TABLE db.table PARTITION (appname='SCORING',indicator='segment_id:4|process_date:20220417') RENAME TO PARTITION (appname='SCORING',indicator='process_date:20220417')

org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename partition. Partition already exists:db.table


There is 1 answer

Matt Andruff On

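The rename fails because RENAME TO PARTITION only moves a single partition to a new name. It cannot merge data into a partition that already exists, so the data has to be rewritten instead.

You can do this with a SQL statement that uses DISTRIBUTE BY.

The Spark DataFrame API offers more tools for changing the partitioning.

You can use partitionBy on the DataFrame writer to control which columns the output is partitioned by.

Or you could write a SELECT that grabs the partitioned data, then use coalesce or repartition to bring it down to 1 partition.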
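One way to apply the select-and-rewrite idea to the partitions from the question is sketched below. It assumes a SparkSession with Hive support, and Spark 2.4 or later for the COALESCE hint. The helper names `build_merge_statements` and `merge_partitions` are hypothetical, as is the `data_columns` argument, which must list the table's non-partition columns (the question does not show them, so `score` is only a stand-in). The sketch appends the rows of the source partitions to the target partition, then drops the sources.

```python
def build_merge_statements(table, data_columns, appname, sources, target):
    """Build SQL that merges the `sources` indicator partitions into `target`.

    Assumes partition values contain no single quotes.
    """
    cols = ", ".join(data_columns)
    in_list = ", ".join(f"'{s}'" for s in sources)
    # Append the rows of every source partition to the target partition.
    # The COALESCE(1) hint asks Spark to reduce the result to one partition
    # before writing, so the merged data lands in a single file.
    insert = (
        f"INSERT INTO TABLE {table} "
        f"PARTITION (appname='{appname}', indicator='{target}') "
        f"SELECT /*+ COALESCE(1) */ {cols} FROM {table} "
        f"WHERE appname='{appname}' AND indicator IN ({in_list})"
    )
    # Remove the old partitions once their rows have been copied.
    drops = [
        f"ALTER TABLE {table} DROP IF EXISTS "
        f"PARTITION (appname='{appname}', indicator='{s}')"
        for s in sources
    ]
    return [insert] + drops


def merge_partitions(spark, table, data_columns, appname, sources, target):
    """Run the statements in order; `spark` is any object with a .sql() method."""
    for statement in build_merge_statements(
        table, data_columns, appname, sources, target
    ):
        spark.sql(statement)


if __name__ == "__main__":
    # "db.table" is the placeholder name from the question; "score" is a
    # made-up data column used only for illustration.
    for statement in build_merge_statements(
        "db.table",
        ["score"],
        "SCORING",
        [
            "segment_id:1|process_date:20220417",
            "segment_id:3|process_date:20220417",
            "segment_id:4|process_date:20220417",
        ],
        "process_date:20220417",
    ):
        print(statement)
```

With a real session, `merge_partitions(spark, "db.table", [...], "SCORING", sources, "process_date:20220417")` runs the statements in order. Because `spark.sql` raises on failure, the drops are skipped if the insert fails. It is worth trying this on a copy of the table first, since dropping partitions of a managed table deletes their files.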