Uploading a local file to a remote HDFS with the Java API, but it connects to localhost


I have this very simple upload method to upload a file to a single-node HDP 2.5 cluster:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;

Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(new URI("webhdfs://hdfshost:50070"), conf);
fs.copyFromLocalFile(false, true, new Path(localFilePath), new Path(hdfsPath));

Tracing what happens, the flow starts correctly:

  • connect to hdfshost:50070,
  • check if file already exists (no),
  • connect to datanode.

That is where it fails: the datanode address is reported as localhost:50075 instead of hdfshost:50075, resulting in a "java.net.ConnectException: Connection refused".
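In the WebHDFS protocol, the client learns the datanode's address from the Location header of the namenode's redirect; whatever host the datanode advertised is the host the client connects to next. A minimal sketch of that step (the redirect URL below is made up for illustration):

```java
import java.net.URI;

public class RedirectHost {
    public static void main(String[] args) throws Exception {
        // Hypothetical Location header returned by the namenode redirect.
        // The host here is whatever address the datanode advertised itself as.
        String location = "http://localhost:50075/webhdfs/v1/tmp/file?op=CREATE";
        URI redirect = new URI(location);
        // The client connects to this host:port next - if the datanode
        // advertised itself as localhost, the connection goes to loopback.
        System.out.println(redirect.getHost() + ":" + redirect.getPort());
    }
}
```

So the "Connection refused" happens on the client machine's own loopback interface, where no datanode is listening.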

I have the following relevant settings on hdp:

  • dfs.client.use.datanode.hostname => true
  • dfs.datanode.http.address => 0.0.0.0:50075
  • dfs.namenode.http-address => 0.0.0.0:50070

I could not find any reason why localhost is used instead of hdfshost (and there is no override in /etc/hosts, neither on the local machine nor on the cluster). Any help would be much appreciated.

1 Answer

Nico (best answer):

You need to change the http-address configuration to the host's real IP address instead of 0.0.0.0. With the bind address set to 0.0.0.0, the advertised address resolves to localhost, and that is what gets used when dfs.client.use.datanode.hostname => true; with the real IP address, it resolves to the host's DNS name, which the client can then reach.
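The underlying point is that 0.0.0.0 is the wildcard bind address, not a name clients can route to; a quick JDK check illustrates this:

```java
import java.net.InetAddress;

public class WildcardDemo {
    public static void main(String[] args) throws Exception {
        // 0.0.0.0 is the "any" address: it tells the daemon which interfaces
        // to bind, but it is not a routable name for remote clients.
        InetAddress any = InetAddress.getByName("0.0.0.0");
        System.out.println(any.isAnyLocalAddress()); // true
        // If this bind address leaks into what the datanode advertises, the
        // client ends up on its own loopback instead of the cluster host.
    }
}
```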

Since it works I will post this as an answer, though I don't know whether my reasoning behind the solution is correct. If anybody knows the exact reason, please add it as a comment or edit my answer.
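Concretely, the change described above amounts to replacing the wildcard bind addresses in hdfs-site.xml with the host's own name or IP; a sketch, where hdfshost stands in for the cluster host's actual address:

```xml
<!-- hdfs-site.xml: bind to a concrete address instead of the 0.0.0.0 wildcard -->
<property>
  <name>dfs.datanode.http.address</name>
  <value>hdfshost:50075</value>
</property>
<property>
  <name>dfs.namenode.http-address</name>
  <value>hdfshost:50070</value>
</property>
```

Restart the HDFS services after the change so the datanode re-registers with the new address.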