Pig - Python UDF issue


I am trying to load a .mmdb file in Pig and then pass it to a Python script. However, I get the error message: "Invalid scalar projection: db: A column needs to be projected from a relation for it to be used as a scalar". My code is:

REGISTER 'py_pigscript.py' USING jython AS myudf;
log = LOAD 'test.txt' USING PigStorage(',') AS (x:int);
db = LOAD 'data.mmdb';
result = FOREACH log GENERATE myudf.function(x,db);

Any help would be appreciated. Thank you!

-edit:

The goal of this script is to extract a value from each row in 'test.txt' and look it up in 'data.mmdb' to return additional data.


There is 1 answer

Dan M (best answer)

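A similar issue is discussed here [1]. The error occurs because db is passed to the UDF as a whole relation; Pig needs a column projected from it before it can be used as a scalar. In the context of your question, the code would look like: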

log = LOAD 'test.txt' USING PigStorage(',') AS (x:int);
db = LOAD 'data.mmdb' AS (entry:(field_1:chararray, field_2....));
result = FOREACH log GENERATE myudf.function(x, db.entry);
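
For completeness, here is a minimal sketch of what the Python side could look like when it receives db.entry as its second argument. The question does not show py_pigscript.py, so the body of function, the output schema "result:chararray", and the use of the first tuple field are all assumptions for illustration. As far as I know, Pig makes the outputSchema decorator available when it registers a Jython script; the fallback definition is only there so the file also runs outside Pig.

```python
# py_pigscript.py -- hypothetical sketch of the UDF registered as `myudf`.
# Assumption: Pig supplies `outputSchema` for Jython scripts. The fallback
# below is only used when the file is run outside Pig (e.g. for testing).
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap


@outputSchema("result:chararray")  # assumed schema, not from the question
def function(x, entry):
    # x is the int field from `log`; entry is the tuple projected as db.entry.
    # Pig nulls arrive as None, so guard before touching either argument.
    if x is None or entry is None:
        return None
    # Placeholder logic (assumption): combine x with the first tuple field.
    return "%s:%s" % (x, entry[0])
```

Note that Pig only accepts db.entry as a scalar when db holds a single row; with more rows the job should fail at run time.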

[1] Pig pass relation as argument to UDF