Copy Files from AWS S3 to HDFS (Hadoop Distributed File System)


I'm trying to copy Avro files from an AWS S3 bucket to HDFS using the following Scala code:

import org.apache.spark.sql.SaveMode

val avroDF = sparkSession.read.format("com.databricks.spark.avro").load("s3a://" + s3Location + "/")
avroDF.write.format("com.databricks.spark.avro").mode(SaveMode.Append).save(filePath)

When the files are copied to HDFS, they are saved as part files (for example, part-0001.avro). How can I save each file under the same name it has in the AWS S3 bucket?
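For context, one possible direction is sketched below. Spark's DataFrame writer picks the part-file names itself, so keeping the original names probably means copying the files as files rather than reading and rewriting them as a DataFrame. The sketch uses the Hadoop FileSystem API and assumes the s3a connector and its credentials are already configured. The object name CopyKeepingNames and the method copyFiles are hypothetical names made up for illustration. It only copies files directly under the source directory (no recursion) and runs sequentially on the driver.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileUtil, Path}

// Hypothetical helper, not part of Spark or Hadoop.
object CopyKeepingNames {

  /** Copies every regular file directly under srcDir into dstDir,
    * keeping each file's name. Returns the names that were copied.
    */
  def copyFiles(srcDir: String, dstDir: String, conf: Configuration): Seq[String] = {
    val src = new Path(srcDir)
    val dst = new Path(dstDir)
    // Resolve a file system per path, so the source can be s3a:// and the target hdfs://.
    val srcFs = src.getFileSystem(conf)
    val dstFs = dst.getFileSystem(conf)
    dstFs.mkdirs(dst)

    srcFs.listStatus(src).toSeq
      .filter(_.isFile) // skip sub-directories
      .map { status =>
        val name = status.getPath.getName
        // deleteSource = false keeps the S3 object; overwrite = true replaces an existing target.
        FileUtil.copy(srcFs, status.getPath, dstFs, new Path(dst, name), false, true, conf)
        name
      }
  }
}
```

With the variables from the question, a call might look like CopyKeepingNames.copyFiles("s3a://" + s3Location + "/", filePath, sparkSession.sparkContext.hadoopConfiguration). For large buckets, hadoop distcp is the usual tool for this kind of copy and also preserves file names.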

There are 0 answers