FETCH the latest partition from HIVE table

584 views Asked by At

Hi I am very much new to this. I have three columns YEAR, MONTH,DAY in INTEGER format.

I want to load the script and combine YEAR,MONTH,DAY as single column and fetch the maximum.

I tried like,

Load year,month,date from HIVE.`abc`.`abc1';
SELECT max(cast(year as String) || '_' || cast(month as string) || '_' || cast(day as string)) as result FROM HIVE.`abc`.`abc1';

By doing this I will get the result as 2020_5_21. But I should use the separator and find the max of the date.

The following error occurred: Connector reply error: SQL##f - SqlState: S1000, ErrorCode: 35, ErrorMsg: [Cloudera][Hardy] (35) Error from server: error code: '1' error message: 'Error while compiling statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask'.

I want to use the result in WHERE clause. But I don't know the statement. SQL select * from HIVE.abc.`abc1' where ---- ;

Please help.

1

There are 1 answers

4
leftjoin On BEST ANSWER

If month and day are stored as integers, you need to use lpad() to add zero if it is single digit month or day. For example month 5 should become 05. Without this max may work incorrectly. Also use dash as a separator to have date in compatible format.

max(concat(year,'-',lpad(month, 2,0),'-',lpad(day, 2,0)))

And to use it in the WHERE use WHERE date in (select max ...):

SELECT * from your_table
WHERE concat(year,'-',lpad(month, 2,0),'-',lpad(day, 2,0)) in (select max(concat(year,'-',lpad(month, 2,0),'-',lpad(day, 2,0))) from your_table)

Also you may need to quote names like year, month, day in backticks everywhere in sql:

max(concat(`year`,'-',lpad(`month`, 2,0),'-',lpad(`day`, 2,0)))