FileUtils write method does not work on Azure Databricks


I'm having trouble writing a file to my Databricks cluster's driver (as a temp file). I have a Scala notebook on my company's Azure Databricks that contains these lines of code:

import java.io.File
import org.apache.commons.io.FileUtils

val xml: String = Controller.requestTo(url)
val bytes: Array[Byte] = xml.getBytes

val path: String = "dbfs:/data.xml"
val file: File = new File(path)
FileUtils.writeByteArrayToFile(file, bytes)

dbutils.fs.ls("dbfs:/")

val df = spark.read.format("com.databricks.spark.xml")
                   .option("rowTag", "generic:Obs")
                   .load(path)

df.show

file.delete()

However, it crashes with org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: dbfs:/data.xml. When I run ls on the root of the DBFS, it doesn't show the file data.xml, so to me FileUtils is not doing its job. What confuses me even more is that the following code works when run on the same cluster, same Azure resource group, same instance of Databricks, but in another notebook:

val path: String = "mf-data.grib"
val file: File = new File(path)
FileUtils.writeByteArrayToFile(file, bytes)

I tried restarting the cluster, removing "dbfs:/" from the path, putting the file in the dbfs:/tmp/ directory, and using FileUtils.writeStringToFile(file, xml, StandardCharsets.UTF_8) instead of FileUtils.writeByteArrayToFile, but none of those solutions has worked, even when combined.
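To check where the file actually ends up, something like this could help (a sketch; the /databricks/driver path is the usual driver working directory and is an assumption here):

// java.io.File treats "dbfs:/data.xml" as a relative local path, so the bytes
// may end up under the driver's working directory rather than on DBFS.
// The file:/ scheme makes dbutils.fs.ls list the driver's local filesystem:
dbutils.fs.ls("file:/databricks/driver/")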


There are 2 answers

Alex Ott

If you're using local APIs, like java.io.File, you need to use the corresponding local file access: instead of using dbfs:/, prefix the path with /dbfs/, so your code will look like the following:

val file: File = new File(path.replaceFirst("dbfs:", "/dbfs"))
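A fuller sketch of that fix, assuming the same xml, bytes, and rowTag as in the question; FileUtils writes through the local /dbfs/ mount while Spark keeps reading through the dbfs:/ scheme:

import java.io.File
import org.apache.commons.io.FileUtils

val path: String = "dbfs:/data.xml"

// Local Java APIs see DBFS through the /dbfs/ mount, not the dbfs:/ URI scheme
val file: File = new File(path.replaceFirst("dbfs:", "/dbfs"))
FileUtils.writeByteArrayToFile(file, bytes)

// Spark understands the dbfs:/ scheme directly, so the original path still works here
val df = spark.read.format("com.databricks.spark.xml")
                   .option("rowTag", "generic:Obs")
                   .load(path)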
Karthikeyan Rasipalay Durairaj

Try removing the dbfs: prefix here: val path: String = "dbfs:/data.xml". For illustration purposes I have given 3 different magic command cells: %sh, %fs, and %scala.
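Since the screenshot isn't reproduced here, a plain-Scala sketch of the same idea; using dbutils.fs.put to write data.xml this way is my assumption, not something shown in the original answer:

// dbutils.fs.put writes a string directly to DBFS; overwrite = true replaces any existing file
dbutils.fs.put("dbfs:/data.xml", xml, true)

// The file is now visible both to DBFS listings and to Spark
dbutils.fs.ls("dbfs:/")
val df = spark.read.format("com.databricks.spark.xml")
                   .option("rowTag", "generic:Obs")
                   .load("dbfs:/data.xml")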
