I have a specific scenario: creating a file extract (.dat, delimited) in Scala/Spark, and I'm looking for suggestions on an alternative approach.

Header & Trailer creation:

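// "|" is an assumed delimiter; write.text needs a single string column
val header = Seq(filename, system_time).mkString("|")
Seq(header).toDS.write.text("/path/to/header/creation/dir")
val trailer = Seq(rowscount, filename).mkString("|")
Seq(trailer).toDS.write.text("/path/to/trailer/creation/dir")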

I extract the data from a Hive table into a DataFrame:

val df = sql("select * from hive")

The DataFrame has a different schema from the header and trailer. At present I write the three parts separately, then merge header, data, and trailer into the final file.

My question: is there a way to create the final file in one go, without storing the parts separately and merging them afterwards?
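
To make the goal concrete, here is a rough sketch of the kind of single-pass write being asked about, not a confirmed solution. It assumes every data row can be flattened into one delimited string, so header, body, and trailer share a single string column and can be unioned. The object name `SingleFileExtract`, the helper `withHeaderAndTrailer`, the `"|"` delimiter, and the output path are all hypothetical. Note that `union` followed by `coalesce(1)` keeps header, rows, trailer in that order in practice, but Spark does not formally guarantee it.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}
import org.apache.spark.sql.functions.{coalesce, col, concat_ws, lit}

object SingleFileExtract {
  // Hypothetical helper: returns header + data rows + trailer as one Dataset[String].
  def withHeaderAndTrailer(spark: SparkSession, df: DataFrame, filename: String,
                           systemTime: String, delim: String = "|"): Dataset[String] = {
    import spark.implicits._

    // Flatten each row into one delimited string. Nulls become empty strings
    // so that concat_ws does not drop them and shift the column positions.
    val cells = df.columns.map(c => coalesce(col(c).cast("string"), lit("")))
    val body = df.select(concat_ws(delim, cells: _*)).as[String]

    // Same fields as the header/trailer in the question.
    val header = Seq(Seq(filename, systemTime).mkString(delim)).toDS()
    val rowCount = df.count() // runs a separate job; caching df first may help
    val trailer = Seq(Seq(rowCount.toString, filename).mkString(delim)).toDS()

    // One partition, so the writer emits a single part file.
    header.union(body).union(trailer).coalesce(1)
  }
}

// Usage (path is a placeholder):
// SingleFileExtract.withHeaderAndTrailer(spark, df, filename, system_time)
//   .write.text("/path/to/final/extract/dir")
```

Two trade-offs with this sketch: `coalesce(1)` funnels the write through a single task, which can be slow for large extracts, and Spark still produces a directory containing one `part-*` file rather than a file with the exact target name.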
