In Pig Latin, I am not able to load data as multiple tuples, please advise


I am not able to load the data as multiple tuples, and I am not sure what mistake I am making. Please advise.

data.txt
vineet  1   pass    Govt
hisham  2   pass    Prvt
raj 3   fail    Prvt

I want to load them as 2 tuples.

A = LOAD 'data.txt' USING PigStorage('\t') AS (T1:tuple(name:bytearray, no:int), T2:tuple(result:chararray, school:chararray));

OR

A = LOAD 'data.txt' USING PigStorage('\t') AS (T1:(name:bytearray, no:int), T2:(result:chararray, school:chararray));

When I run dump A; the output below is displayed, one line per record. I don't know why I am not able to read the actual data from data.txt.

(,)
(,)
(,)

There is 1 answer

Answered by Murali Rao

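PigStorage treats each tab-separated field as one column, so the schema in the question asks Pig to interpret the first field (vineet) as the tuple T1 and the second field (1) as the tuple T2. As the input data is not stored in tuple form, we won't be able to read it directly into tuples; the failed conversions presumably leave both fields null, which is why every row is dumped as (,).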
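
For comparison, here is a minimal sketch of input that, as far as I know, PigStorage can load directly into tuple-typed fields: each tab-separated field has to be written in Pig's textual tuple form, (a,b). The file name tuples.txt and its contents are hypothetical, made up only for illustration, and I have not run this against a particular Pig version.

```pig
-- tuples.txt (hypothetical file; <TAB> stands for a single tab character):
--   (vineet,1)<TAB>(pass,Govt)
--   (hisham,2)<TAB>(pass,Prvt)
--   (raj,3)<TAB>(fail,Prvt)

-- Each tab-separated field is already in tuple form, so the tuple schema
-- from the question should line up with the data.
A = LOAD 'tuples.txt' USING PigStorage('\t')
    AS (T1:tuple(name:chararray, no:int),
        T2:tuple(result:chararray, school:chararray));
DUMP A;
```

Since data.txt in the question is flat, rewriting the file like this is usually less practical than building the tuples after loading.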

One feasible approach is to read the data as flat columns first and then form the tuples from the required fields.

Pig Script :

A = LOAD 'a.csv' USING PigStorage('\t') AS (name:chararray,no:int,result:chararray,school:chararray);
B = FOREACH A GENERATE (name,no) AS T1:tuple(name:chararray, no:int), (result,school) AS T2:tuple(result:chararray, school:chararray);
DUMP B;

Input : a.csv

vineet  1   pass    Govt
hisham  2   pass    Prvt
raj 3   fail    Prvt

Output : DUMP B:

((vineet,1),(pass,Govt))
((hisham,2),(pass,Prvt))
((raj,3),(fail,Prvt))

Output : DESCRIBE B :

B: {T1: (name: chararray,no: int),T2: (result: chararray,school: chararray)}
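
As an alternative sketch, the built-in TOTUPLE function can build the same nested tuples, and the fields inside them can then be read back with the dot operator. The relation name C is only illustrative, and positional references ($0, $1) are used so the example does not assume what the inner fields end up being called.

```pig
-- Load the four flat, tab-separated columns first (same a.csv as above).
A = LOAD 'a.csv' USING PigStorage('\t')
    AS (name:chararray, no:int, result:chararray, school:chararray);

-- TOTUPLE is a built-in that packs its arguments into a single tuple.
B = FOREACH A GENERATE TOTUPLE(name, no) AS T1, TOTUPLE(result, school) AS T2;

-- Dereference fields inside each tuple by position with the dot operator:
-- T1.$0 is the first field of T1, T2.$1 is the second field of T2.
C = FOREACH B GENERATE T1.$0 AS name, T2.$1 AS school;
DUMP C;
```

Running DESCRIBE B after this version is a quick way to check which schema Pig inferred for T1 and T2.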