I have a view that uses max to show the latest partition (partition values have the format 2021-01, 2021-02, 2021-03, 2021-04). The Hive table also has a __HIVE_DEFAULT_PARTITION__ partition.
When we run the query in Impala, max on the partition column gives the correct value of 2021-04, ignoring __HIVE_DEFAULT_PARTITION__, but the same does not work when we run the query in Hive, where it returns __HIVE_DEFAULT_PARTITION__.
Is there any way to make the Hive query ignore the default partition, if it exists, when returning max on that column?
You can filter it:
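A minimal sketch of such a filter, assuming a string partition column named partition_col and a placeholder table name your_table (the table name is hypothetical, not taken from the question):

```sql
-- Exclude the default partition before taking the max.
-- your_table is a placeholder; partition_col is assumed to be a string partition column.
SELECT max(partition_col) AS latest_partition
FROM your_table
WHERE partition_col != '__HIVE_DEFAULT_PARTITION__';
```

Because the predicate is on the partition column, Hive should be able to prune the default partition rather than scan it.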
If you do not need the data in __HIVE_DEFAULT_PARTITION__, you can drop it.

Transforming __HIVE_DEFAULT_PARTITION__ to NULL can be a solution if, along with max(partition_col), you want to aggregate something else and do not want to exclude the __HIVE_DEFAULT_PARTITION__ partition:
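One way this could look, again assuming a string partition column partition_col and a placeholder table your_table; the count(*) column is only an illustrative second aggregate:

```sql
-- Map the default partition to NULL so max() skips it,
-- while other aggregates still see rows from every partition.
SELECT max(CASE WHEN partition_col = '__HIVE_DEFAULT_PARTITION__'
                THEN NULL
                ELSE partition_col
           END)  AS latest_partition,
       count(*)  AS total_rows   -- illustrative: includes the default partition's rows
FROM your_table;
```

This works because max ignores NULL values. For the drop option mentioned above, a sketch under the same naming assumptions might be:

```sql
-- Removes the default partition; on a managed table this also deletes its data.
ALTER TABLE your_table
  DROP IF EXISTS PARTITION (partition_col = '__HIVE_DEFAULT_PARTITION__');
```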