I have around 50 partitions in a Hive table. I need to merge each set of partitions into one partition. I tried the RENAME PARTITION command, but I get an error message.
I need help merging multiple Hive partitions into one partition in Spark.
ALTER TABLE db.table PARTITION (appname='SCORING',indicator='segment_id:1|process_date:20220417') RENAME TO PARTITION (appname='SCORING',indicator='process_date:20220417')
ALTER TABLE db.table PARTITION (appname='SCORING',indicator='segment_id:3|process_date:20220417') RENAME TO PARTITION (appname='SCORING',indicator='process_date:20220417')
ALTER TABLE db.table PARTITION (appname='SCORING',indicator='segment_id:4|process_date:20220417') RENAME TO PARTITION (appname='SCORING',indicator='process_date:20220417')
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename partition. Partition already exists:db.table
You can do this with a SQL statement that uses DISTRIBUTE BY. In Spark's programming API there are more tools for changing the partitions.
You can use partitionBy on the DataFrame writer to control how Spark partitions the output.
Or you could write a SELECT that grabs the data from the partitions you want to merge, then use coalesce or repartition to bring it down to 1 partition.
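The RENAME fails because a rename cannot merge: once the target partition exists, every further rename into it is rejected. A rough sketch of the select-and-rewrite approach is below. It copies the rows from the source partitions into the target partition and drops the sources afterwards. The helper names (target_indicator, merge_statements, merge_partitions) and the column names passed in are made up for illustration; you have to list your table's real non-partition columns. The COALESCE(1) hint assumes a Spark version that supports SQL coalesce hints (2.4 or later, as far as I know), and I have not run this against your table.

```python
from collections import defaultdict


def target_indicator(indicator):
    """Strip the segment_id part of an indicator value, e.g.
    'segment_id:1|process_date:20220417' -> 'process_date:20220417'."""
    return "|".join(
        part for part in indicator.split("|") if not part.startswith("segment_id:")
    )


def merge_statements(table, appname, indicators, columns):
    """Build SQL that copies each group of source partitions into one
    target partition and then drops the source partitions.

    `columns` must be the non-partition columns of the table (assumption:
    the caller knows them; they are not in the original question)."""
    groups = defaultdict(list)
    for ind in indicators:
        target = target_indicator(ind)
        if ind != target:  # never copy a partition onto itself and drop it
            groups[target].append(ind)

    col_list = ", ".join(columns)
    statements = []
    for target, sources in groups.items():
        in_list = ", ".join(f"'{s}'" for s in sources)
        # Copy all source rows into the single target partition.
        # COALESCE(1) asks Spark to write the result from one task.
        statements.append(
            f"INSERT INTO TABLE {table} "
            f"PARTITION (appname='{appname}', indicator='{target}') "
            f"SELECT /*+ COALESCE(1) */ {col_list} FROM {table} "
            f"WHERE appname='{appname}' AND indicator IN ({in_list})"
        )
        # Only after the copy: remove the old partitions.
        for s in sources:
            statements.append(
                f"ALTER TABLE {table} DROP IF EXISTS "
                f"PARTITION (appname='{appname}', indicator='{s}')"
            )
    return statements


def merge_partitions(spark, table, appname, indicators, columns):
    """Run the generated statements in order on an existing SparkSession."""
    for stmt in merge_statements(table, appname, indicators, columns):
        spark.sql(stmt)
```

Print the output of merge_statements first and check the row counts of the target partition before you let the DROP statements run. On a managed table, dropping a partition also deletes its files. If one of the renames already succeeded, the target partition already holds that segment's rows, and INSERT INTO just appends the rest to it.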