I'm trying to set up Nutch 2.2.1 with HBase 0.94.14 on Debian Squeeze. I've carefully followed the Nutch 1 and 2 tutorials and various other documentation. I was able to build HBase 0.94.14 and eventually got it to work (I can create tables, etc.). I was also able to build Nutch without any issue (it is set to use Gora 0.3).
Now the issues are: 1- when trying to launch Nutch, I get the following trace:
./nutch inject /root/nutch/apache-nutch-2.2.1/urls/
InjectorJob: starting at 2014-11-27 09:43:53
InjectorJob: Injecting urlDir: /root/nutch/apache-nutch-2.2.1/urls
InjectorJob: java.lang.ClassNotFoundException: org.apache.gora.memory.store.HBaseStore
at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
etc.
Using strace -f, I've figured out that "HBaseStore.class" was not found:
stat("/root/nutch/apache-nutch-2.2.1/runtime/local/org/apache/gora/memory/store/HBaseStore.class",\
<unfinished ...>
[pid 1827] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
I tried to figure out whether there was a classpath issue, but eventually found out that:
- HBaseStore.class was present neither in the Nutch directory tree nor in the HBase 0.94.4 directory tree
- the HBase jar version in the Nutch tree was, surprisingly, hbase-0.90.4.jar
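For anyone who wants to repeat the search for the class file, here is a minimal sketch of how it could be done across all jars under a directory. It is not the exact procedure I used; the function name find_class_in_jars is hypothetical, and it assumes the class would be packaged inside a .jar archive rather than lying around as a loose .class file.

```python
import zipfile
from pathlib import Path


def find_class_in_jars(root, class_name):
    """Return (jar_path, entry) pairs for jar entries named <class_name>.class.

    Hypothetical helper: walks every *.jar below `root` and lists the
    archive entries whose file name is exactly class_name + ".class".
    """
    file_name = class_name + ".class"
    hits = []
    for jar in sorted(Path(root).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                for entry in zf.namelist():
                    # Match the class at the archive root or inside any package.
                    if entry == file_name or entry.endswith("/" + file_name):
                        hits.append((str(jar), entry))
        except zipfile.BadZipFile:
            # Skip files that carry a .jar extension but are not valid archives.
            continue
    return hits
```

Calling find_class_in_jars("/root/nutch/apache-nutch-2.2.1", "HBaseStore") would list every jar in the Nutch tree that contains the class, together with the package path it is stored under, which can then be compared with the class name in the exception.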
Following some online discussions I found, I replaced hbase-0.90.4.jar in the Nutch tree with hbase-0.94.4 from the HBase tree...
But:
- it doesn't fix the Java issue
- each time I rebuild Nutch, hbase-0.90.4.jar is back, and I can't find any source for it in the Nutch tree :-/
Note that /root/nutch/apache-nutch-2.2.1/conf/hbase-site.xml has:
<property>
<name>hbase.rootdir</name>
<value>/root/nutch/hbase-master/conf/</value>
</property>
which corresponds to HBase 0.94.4 ...
I also tried to rebuild with Gora 0.5, but that makes the Nutch build fail.
I'm not a Java expert at all, and I don't understand why Nutch is not using the correct version of HBase, or why sources and Java classes seem to be missing. At this point I'm totally stuck. What a mess.
Thanks for any tip that could help to save this situation.
Alfonso,
I checked gora.properties, and it was OK.
Also, I've tried the latest 2.3 snapshot, but unfortunately it ran into a dependency issue at build time: