Replace Digits Using SubString and lpad/rpad In Hive

171 views Asked by At

Hellow Everyone,

Someone Kinldy help me to Implement Below logic In Hive.I have 2 Table in hive(table1,table2). I need to replace 0's by 9 from One column of table1 with some conditions, and need to generate one output column then same 'output' column I will join from table2 column(High Column) and will produce the output from both the tables.

table1

SCHEME_1       LOW_1
01            12340000
01            12345000
01            12300000

table2

SCHEME            HIGH
01               12349999
01               12345999
01               12399999

Here From Table1 I need to Generate new Column output with below Conditions.

scenario:

 1. I need to check 0's from right to left and as soon as i find 0 before any digit then i need to replace all the trailing 0's by 9.
 2. In output at-least 4  digit should be there before 9.
 3. In Input if 4 or less digits are available in input then I need to skip some 0's and make sure that at-least 4 digits are there before 9.
 4. If more than 4 digits are available before trailing 0's then only need to replace 0,No need to replace digits.

LOW_1        output
12340000     12349999
12345000     12345999
12300000     12309999

Then Need to Join this 'output' column with table2 HIGH then want to produce the data.

expected output

SCHEME            LOW_1      output      HIGH
01              12340000      12349999   12349999
01              12345000      12345999   12345999

I am Using below query and want to write output column logic so that it can join with the table2.

with table1 as
(
select LOW_1,SCHEME_1 from table1 where SCHEME_1='01'
)
select table2.*,
SCHEME_1,
LOW_1
from table2 inner join table1
on
(
table2.high=output
);

below is the logic for the output column which I want to add in abobe query.

with table1 as (
 select LOW_1 
 ) select LOW_1,
 lpad(concat(splitted[0], translate(splitted[1],'0','9')),8,0) as output 
 from ( 
 select LOW_1, split(regexp_replace(LOW_1,'(\\d{*?}?)(0+)$','$1|$2'),'\\|') splitted from table1 )s;


 Can Someone Suggest me to Implement the same.
1

There are 1 answers

0
Sonu On

It Worked Now. I Updated My Query as Below.

with table1 as
(
select LOW_1,SCHEME_1,
lpad(
concat(
split(regexp_replace('(\\d{4,}?)(0+)$','$1|$2'),'\\|')[0], 
translate(split(regexp_replace(LOW_1,'(\\d{4,}?)(0+)$','$1|$2'),'\\|')[1],'0','9')),8,0 ) 
as output
 from table1 where SCHEME_1='01'
)
select table2.*,
SCHEME_1,
LOW_1
from table2 inner join table1
on
(
table2.high=output
);