Scala - Create Dataframe with only 1 row from a List using for comprehension


For some odd reason I need to take the column names of a dataframe and insert them as its first row (I cannot simply import the data without a header). I tried using a for comprehension to create a dataframe with 1 row and 30 columns (there are 30 headers) and union it with the original dataframe. What I got instead is a dataframe with 1 row and only 1 column, whose value is a list of 30 strings.

What I tried:

val headerDF = Seq((for (col <- data.columns) yield col)).toDF
display(headerDF)

Actual Behavior:

Column A
["col1", "col2", "col3", ...]

Expected Behavior:

Column A   Column B   Column C
col1       col2       col3

There is 1 answer

Oli

One solution is to use spark.range(1) to create a one-row dataframe and then select one literal column per column name, like this:

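import org.apache.spark.sql.functions.lit
import spark.implicits._ // needed for toDF outside spark-shell or a notebook

// a random dataframe with 4 columns
val df = Seq(("a", "b", "c", "d")).toDF("A", "B", "C", "D")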
df.show
+---+---+---+---+
|  A|  B|  C|  D|
+---+---+---+---+
|  a|  b|  c|  d|
+---+---+---+---+
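// one-row dataframe: each column name becomes a literal value in a column of the same name
val header = spark.range(1).select(df.columns.map(c => lit(c) as c) : _*)
// union matches columns by position; here the header row is appended after df's rows
df.union(header).show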
+---+---+---+---+
|  A|  B|  C|  D|
+---+---+---+---+
|  a|  b|  c|  d|
|  A|  B|  C|  D|
+---+---+---+---+
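
As for why the attempt in the question produced a single column: data.columns is an Array[String], so the for comprehension yields one array, and Seq(...) wraps that array as a single element. toDF then sees one row holding one array-typed value.

Since the question asks for the header as the first row, and the real data may not be all strings, a minimal sketch of a variant could look like the following. The object name HeaderRowSketch and the helper withHeaderRow are hypothetical, not part of Spark. The sketch assumes that casting every column to string is acceptable and that column names contain no dots or other characters needing backtick quoting.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, lit}

object HeaderRowSketch {
  // Hypothetical helper: returns df with its column names placed in a
  // one-row dataframe that is unioned in front of the data rows.
  def withHeaderRow(df: DataFrame): DataFrame = {
    val spark = df.sparkSession

    // One row, one string literal per column, named after the column.
    val header = spark.range(1).select(df.columns.map(c => lit(c).as(c)): _*)

    // Assumption: all data columns may be cast to string so that both
    // sides of the union have the same column types.
    val body = df.select(df.columns.map(c => col(c).cast("string").as(c)): _*)

    // union is positional; the header side is listed first.
    header.union(body)
  }
}
```

Calling HeaderRowSketch.withHeaderRow(df).show on the dataframe above should list the header row before the data row. Spark does not formally guarantee row order after later shuffles or sorts, so if the position matters it is safest to write or collect the result right after the union.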