How to connect to multiple HDFS clusters from an external application


I am trying to connect to multiple HDFS clusters from an external application running on Kubernetes, in order to access HDFS data across systems. I am able to connect to one HDFS cluster by copying krb5.conf, hive-site.xml, hdfs-site.xml and the other config files onto the classpath.

**core-site.xml**
    <!--Autogenerated by Cloudera Manager-->
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://cluster1</value>
      </property>
      <property>
        <name>fs.trash.interval</name>
        <value>1</value>
      </property>
    </configuration>

**hdfs-site.xml**

    <!--Autogenerated by Cloudera Manager-->
    <configuration>
      <property>
        <name>dfs.nameservices</name>
        <value>cluster1</value>
      </property>
    </configuration>

Now I can connect to this cluster and read an HDFS file:

    val dfCluster1 = spark.read.format("avro").load("/cluster1/folder1")
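
If I understand it correctly, that relative path is resolved against fs.defaultFS, so it should be equivalent to the fully qualified form (assuming /cluster1/folder1 is just a directory under the root):

    val dfCluster1 = spark.read.format("avro").load("hdfs://cluster1/cluster1/folder1")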

Now I want to connect to the second HDFS cluster and read its contents. I have the hdfs-site.xml and core-site.xml from the second cluster, but how do I make Spark understand that I need to connect to the second cluster, given that I can only have one hive-site.xml and hdfs-site.xml on the classpath?
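
One idea I am considering (not sure if it is the right approach) is to keep the classpath files for cluster1 and register the second nameservice programmatically on the Hadoop configuration, then address cluster2 with fully qualified hdfs:// URIs. The host names, ports and paths below are placeholders, not my real values:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("multi-hdfs").getOrCreate()
    val conf = spark.sparkContext.hadoopConfiguration

    // cluster1 is already defined by the hdfs-site.xml on the classpath;
    // add cluster2 as a second HA nameservice alongside it
    conf.set("dfs.nameservices", "cluster1,cluster2")
    conf.set("dfs.ha.namenodes.cluster2", "nn1,nn2")
    conf.set("dfs.namenode.rpc-address.cluster2.nn1", "namenode1.cluster2.example.com:8020")
    conf.set("dfs.namenode.rpc-address.cluster2.nn2", "namenode2.cluster2.example.com:8020")
    conf.set("dfs.client.failover.proxy.provider.cluster2",
      "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider")

    // fs.defaultFS still points at cluster1, so cluster2 paths must be fully qualified
    val dfCluster1 = spark.read.format("avro").load("/cluster1/folder1")
    val dfCluster2 = spark.read.format("avro").load("hdfs://cluster2/folder2")

Is something like this the recommended way, or is it better to merge both clusters' hdfs-site.xml files into a single one?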
