Reading XML with Cascalog/Cascading

There is some information on the web indicating that Mahout's XMLInputFormat can be used to process XML efficiently on Hadoop, but I have been unable to find an example of how to get this working. Can someone point me in the right direction?

I'm using Cascalog/Clojure.

There is 1 answer

Ashish On

Have a look at this post, which shows how to read an XML file on Hadoop using a custom record reader implementation:

http://javatute.com/javatute/faces/post/hadoop/2014/reading-simple-xml-file-using-hadoop.xhtml
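
To give a rough idea of what such a record reader does internally, here is a minimal sketch in plain Java with no Hadoop dependency. It scans a byte stream for a start tag, copies everything up to the matching end tag, and emits that span as one record. As far as I understand it, Mahout's XmlInputFormat follows the same idea and takes the two tags from the `xmlinput.start` and `xmlinput.end` job properties, but treat that as an assumption and check the version you are using. The class `XmlRecordScanner` and the `<page>` tags are made up for illustration; a real reader would also have to respect input split boundaries, which this sketch ignores.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Hadoop, Mahout or Cascading.
class XmlRecordScanner {

    // Reads bytes until `match` has been seen in full. When `out` is non-null,
    // every byte read is also copied into it. Returns false at end of stream.
    static boolean readUntilMatch(InputStream in, byte[] match, ByteArrayOutputStream out)
            throws IOException {
        int i = 0;
        while (true) {
            int b = in.read();
            if (b == -1) {
                return false;
            }
            if (out != null) {
                out.write(b);
            }
            if (b == match[i]) {
                i++;
                if (i >= match.length) {
                    return true;
                }
            } else {
                // Simple restart rule; sufficient for XML tags because '<'
                // occurs only as the first character of a tag.
                i = (b == match[0]) ? 1 : 0;
            }
        }
    }

    // Returns every span that begins with startTag and ends with endTag.
    static List<String> records(InputStream in, String startTag, String endTag)
            throws IOException {
        byte[] start = startTag.getBytes(StandardCharsets.UTF_8);
        byte[] end = endTag.getBytes(StandardCharsets.UTF_8);
        List<String> result = new ArrayList<>();
        while (readUntilMatch(in, start, null)) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            buf.write(start, 0, start.length);
            if (!readUntilMatch(in, end, buf)) {
                break; // unterminated record at end of input: drop it
            }
            result.add(new String(buf.toByteArray(), StandardCharsets.UTF_8));
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        String xml = "<pages><page><title>A</title></page>\n"
                   + "<page><title>B</title></page></pages>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        for (String record : records(in, "<page>", "</page>")) {
            System.out.println(record);
        }
    }
}
```

Each string returned by `records` is one complete element, which is what the mapper (or a Cascalog operation further downstream) would then parse with an ordinary XML parser. Note that matching on literal tag strings means a start tag with attributes, such as `<page id="1">`, would not be found by this sketch.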