I am trying to use Faunus to pull data out of a Titan-Cassandra graph database and write it to a single-node Hadoop installation running on a remote machine. In other words, the machine on which Faunus runs acts as the source from which the data is streamed, and that data has to be written to the remote single Hadoop node.
In titan-cassandra-input.properties, I point the output at the remote HDFS by setting the output location:
faunus.output.location=hdfs://10.143.57.157:9000/tmp/foutput
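For context, the rest of titan-cassandra-input.properties is essentially the sample file that ships with Faunus; roughly like this (the Cassandra hostname, port and keyspace below are just placeholders for my setup):

# input graph parameters
faunus.graph.input.format=com.thinkaurelius.faunus.formats.titan.cassandra.TitanCassandraInputFormat
faunus.graph.input.titan.storage.backend=cassandra
faunus.graph.input.titan.storage.hostname=localhost
faunus.graph.input.titan.storage.port=9160
faunus.graph.input.titan.storage.keyspace=titan
# output parameters
faunus.graph.output.format=com.thinkaurelius.faunus.formats.graphson.GraphSONOutputFormat
faunus.sideeffect.output.format=org.apache.hadoop.mapreduce.lib.output.TextOutputFormat
faunus.output.location=hdfs://10.143.57.157:9000/tmp/foutput
faunus.output.location.overwrite=true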
I changed the Hadoop configs:
core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://10.143.57.244:9000/</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
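My understanding (which may be wrong) is that fs.default.name is also the address the NameNode binds its RPC server to, and the single-node examples in the Hadoop docs put the node's own address there, e.g.

  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000/</value>
  </property>

but with localhost I am not sure the source machine would be able to reach HDFS remotely, so I used an IP address instead.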
mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>10.143.57.244:9001</value>
  </property>
</configuration>
I have added the source IP to /etc/hosts:
10.143.57.244 hadoop2
But when I try to start Hadoop with ./start-all.sh, the NameNode does not start. The NameNode log shows the following error:
ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.net.BindException:
Problem binding to master/10.143.57.244:9000 : Cannot assign requested address
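For reference, this is how I would check which addresses the Hadoop machine can actually bind to (assuming a standard Linux box):

ip addr show | grep 'inet '
# or, on older systems
ifconfig -a | grep 'inet '

10.143.57.244 is the source machine's address, not the Hadoop machine's, so I assume it would not appear in that list, which would explain the "Cannot assign requested address".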
I cannot work out why the NameNode is trying to bind to the source IP in the first place. Is it treating the source IP as another node in the Hadoop cluster?
I do not want to set up a cluster. I just want the single Hadoop node to accept connections from the source machine. How do I configure this? Please help.