I have a view that uses max to show the latest partition (partition values have the format 2021-01, 2021-02, 2021-03, 2021-04). The Hive table also has a `__HIVE_DEFAULT_PARTITION__` partition.
When we run the query in Impala, max on the partition column gives the correct value of 2021-04, ignoring `__HIVE_DEFAULT_PARTITION__`, but the same query does not work in Hive: it returns `__HIVE_DEFAULT_PARTITION__`.
Is there any way to make the Hive query ignore the default partition, if it exists, when returning max on that column?
You can filter it:
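A minimal sketch of such a filter, assuming a table called `my_table` with a string partition column called `partition_col` (both names are placeholders, not taken from the question):

```sql
-- my_table and partition_col are hypothetical names; substitute your own.
-- The WHERE clause removes the default partition before max() is applied,
-- so the string '__HIVE_DEFAULT_PARTITION__' can no longer sort above '2021-04'.
SELECT max(partition_col) AS latest_partition
FROM my_table
WHERE partition_col != '__HIVE_DEFAULT_PARTITION__';
```

Because the predicate is on the partition column, Hive should be able to prune the default partition rather than scan it.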
If you do not need the data in `__HIVE_DEFAULT_PARTITION__`, you can drop it:
Transforming `__HIVE_DEFAULT_PARTITION__` to NULL can be a solution if, along with `max(partition_col)`, you want to aggregate something else and do not want to exclude the `__HIVE_DEFAULT_PARTITION__` partition:
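The NULL transformation could look roughly like this. The table `my_table` and the column `amount` are placeholders for illustration; only `partition_col` comes from the text above:

```sql
-- my_table and amount are hypothetical names; substitute your own.
-- The CASE expression maps the default partition value to NULL, which max() ignores,
-- while sum(amount) still includes rows stored in the default partition.
SELECT max(CASE
             WHEN partition_col = '__HIVE_DEFAULT_PARTITION__' THEN NULL
             ELSE partition_col
           END)       AS latest_partition,
       sum(amount)    AS total_amount
FROM my_table;
```

For the drop option, a statement along the lines of `ALTER TABLE my_table DROP IF EXISTS PARTITION (partition_col = '__HIVE_DEFAULT_PARTITION__');` should work for a string partition column. Keep in mind that on a managed table this also deletes the data in that partition.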