How can I upload a file into HDFS using the WebHDFS REST API?


I want to upload a file from a local server to HDFS via the WebHDFS REST API. Based on the documentation, this operation takes two steps:

  1. Submit an HTTP PUT request, which returns the location:

    HttpResponseProxy{HTTP/1.1 307 Temporary Redirect [Date: Tue, 06 Jun 2023 10:14:46 GMT, Cache-Control: no-cache, Expires: Tue, 06 Jun 2023 10:14:46 GMT, Date: Tue, 06 Jun 2023 10:14:46 GMT, Pragma: no-cache, X-Content-Type-Options: nosniff, X-FRAME-OPTIONS: SAMEORIGIN, X-XSS-Protection: 1; mode=block, Location: http://fcb6de72d72d:9864/webhdfs/v1/root/EMSI/file.txt?op=CREATE&namenoderpcaddress=namenode:9000&createflag=&createparent=true&overwrite=false, Content-Type: application/octet-stream, Content-Length: 0] [Content-Type: application/octet-stream,Content-Length: 0,Chunked: false]}

  2. Submit another HTTP PUT request, with the file data, using the URL in the Location header:

curl -i -X PUT -T file.txt "http://fcb6de72d72d:9864/webhdfs/v1/root/EMSI/file.txt?op=CREATE&namenoderpcaddress=namenode:9000&createflag=&createparent=true&overwrite=false"

This is what I get: curl: (6) Could not resolve host: fcb6de72d72d

I'm running a Hadoop cluster on Docker on a MacBook Air M1.

References:
Issues with Uploading an image to HDFS via webHDFS REST API
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Create_and_Write_to_a_File

Thank you for your help,


There is 1 answer

OneCricketeer

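The Location returned by the first request uses the DataNode's container ID (fcb6de72d72d) as its hostname, which your host machine cannot resolve. Replace it with localhost. This assumes the DataNode port (9864) is published to your host.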
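As a rough sketch, the hostname swap could be scripted rather than done by hand. The variable names `location` and `upload_url` are only illustrative, the URL is the one from the question, and the upload itself still assumes port 9864 is reachable on localhost:

```shell
# Location header returned by the first PUT (copied from the question).
location='http://fcb6de72d72d:9864/webhdfs/v1/root/EMSI/file.txt?op=CREATE&namenoderpcaddress=namenode:9000&createflag=&createparent=true&overwrite=false'

# Replace only the hostname (the container ID) with localhost;
# the scheme, port, path and query string are left untouched.
upload_url=$(printf '%s\n' "$location" | sed -E 's#^(https?://)[^:/]+#\1localhost#')

echo "$upload_url"

# Second step, to be run once the DataNode port is published (assumption):
# curl -i -X PUT -T file.txt "$upload_url"
```

The `echo` should print the same URL with `localhost:9864` in place of `fcb6de72d72d:9864`, which is what the second curl command then needs.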

Alternatively, use hadoop fs -put instead of WebHDFS.