I have a MapReduce job in which the reducer receives the absolute address of a file residing in Azure Blob storage; the reducer should open the file and read its content. I added the storage account containing the files when provisioning my Hadoop cluster (HDInsight), so the reducer should have access to this Blob storage, but it is not the default HDFS storage for my job. I have the following code in my reducer, but it gives me a FileNotFound error message:
FileSystem fs = FileSystem.get(new Configuration());
Path pt = new Path("wasb://mycontainer@accountname...");
FSDataInputStream stream = fs.open(pt);
It is covered in https://azure.microsoft.com/en-us/documentation/articles/hdinsight-hadoop-use-blob-storage/#addressing
The syntax is wasb://mycontainer@myaccount.blob.core.windows.net/example/jars/hadoop-mapreduce-examples.jar
If "mycontainer" is a private container, you must add the "myaccount" Azure storage account as an additional storage account during the provisioning process.
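
To make the addressing concrete, here is a minimal sketch that splits such an address into its parts using only java.net.URI. It assumes the address has the form wasb://<container>@<account>.blob.core.windows.net/<path>; the class name WasbAddress and its helper methods are hypothetical, made up for this illustration, and are not part of Hadoop or any Azure SDK.

```java
import java.net.URI;

public class WasbAddress {
    // Container name: the part before '@' in the URI authority.
    static String container(URI uri) {
        return uri.getUserInfo();
    }

    // Storage account name: the first label of the host,
    // e.g. "myaccount" in "myaccount.blob.core.windows.net".
    static String account(URI uri) {
        String host = uri.getHost();
        int dot = host.indexOf('.');
        return dot < 0 ? host : host.substring(0, dot);
    }

    public static void main(String[] args) {
        // Placeholder names taken from the example address; substitute your own.
        URI uri = URI.create(
            "wasb://mycontainer@myaccount.blob.core.windows.net/example/jars/hadoop-mapreduce-examples.jar");

        System.out.println("scheme    = " + uri.getScheme());
        System.out.println("container = " + container(uri));
        System.out.println("account   = " + account(uri));
        System.out.println("path      = " + uri.getPath());
    }
}
```

Note that FileSystem.get(Configuration) returns the cluster's default file system, which may be why a path in a different container or account is not found. In the reducer you would probably want to resolve the file system from the path itself, for example with Hadoop's FileSystem.get(URI, Configuration) or Path.getFileSystem(Configuration), so that the scheme, container and account in the address are actually used.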