How to use RCFilePigStorage in Pig


I want to load a text file into Pig and then store it as an RCFile. For this, I found that Twitter provides a storage UDF at this link:

http://grepcode.com/file/repo1.maven.org/maven2/com.twitter.elephantbird/elephant-bird-rcfile/3.0.8/com/twitter/elephantbird/pig/store/RCFilePigStorage.java

Can someone tell me how to compile it and use it in my pig code?

Best answer, by Prabha Satya:

Put all the Twitter (Elephant Bird) dependencies and the Pig jars on the classpath and compile RCFilePigStorage.java. If you want to change some specific behavior, make those changes in the source as well; you can then rename the class, for example to MyRCFilePigStorage in MyRCFilePigStorage.java.

Now take the class files generated by the compilation and package them into a jar file, for example MyRCUdf.jar. Register this jar in your Pig script.

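REGISTER MyRCUdf.jar;
-- your Pig logic goes here
-- 'data' is the relation to store; 'output_path' is a placeholder output directory
STORE data INTO 'output_path' USING MyRCFilePigStorage();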
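
For the original question (load a text file, then write it out as an RCFile), a fuller script might look like the sketch below. The input path, output path, delimiter, schema, and the filter step are all hypothetical and only there for illustration. It also assumes that MyRCFilePigStorage was compiled without a package declaration; if the original package was kept, the class would have to be referenced by its fully qualified name instead.

```pig
-- Jar built from the compiled classes (name taken from the answer above).
-- Any jars the class depends on would need to be registered as well.
REGISTER MyRCUdf.jar;

-- Load a tab-delimited text file. Path and schema are illustrative assumptions.
raw = LOAD 'input/data.txt' USING PigStorage('\t') AS (id:int, name:chararray);

-- Placeholder for your Pig logic.
filtered = FILTER raw BY id IS NOT NULL;

-- Write the relation out through the custom store function.
-- 'output/data_rc' is a hypothetical output directory.
STORE filtered INTO 'output/data_rc' USING MyRCFilePigStorage();
```

If no changes to the class are needed, registering the published Elephant Bird jars and using com.twitter.elephantbird.pig.store.RCFilePigStorage directly may be enough, though which additional jars are required depends on your Pig and Hadoop setup.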

EDIT: See the following links for the Twitter dependencies. Take the source code, compile it, and include the generated classes in your classpath.

https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/MapReduceInputFormatWrapper.java

https://github.com/kevinweil/elephant-bird