Iterate through mixed-Type scala Lists

Question


Asked by QuakerMan One on 18 May 2017 at 18:10 (508 views)

Using Spark 2.1.1, I have an N-row CSV file, referenced as 'fileInput':

colname datatype    elems   start   end
colA    float       10      0       1
colB    int         10      0       9

I have successfully built an array of sql.Row objects from it:

val df = spark.read.format("com.databricks.spark.csv").option("header", "true").load(fileInput)
val rowCnt:Int = df.count.toInt
val aryToUse  = df.take(rowCnt)
// aryToUse: Array[org.apache.spark.sql.Row] = Array([colA,float,10,0,1], [colB,int,10,0,9])

Using those Rows and my random-value-generator scripts, I have successfully populated an initially empty ListBuffer[Any]:

res170: scala.collection.mutable.ListBuffer[Any] = ListBuffer(List(0.24455154, 0.108798146, 0.111522496, 0.44311434, 0.13506883, 0.0655781, 0.8273762, 0.49718297, 0.5322746, 0.8416396), List(1, 9, 3, 4, 2, 3, 8, 7, 4, 6))

Now I have a mixed-type ListBuffer[Any] containing lists of different element types. How do I iterate through these lists and zip them? The [Any] element type seems to defy mapping and zipping. I need to take the N lists generated from the input file's definitions and save them to a CSV file. The final output should be:

ColA, ColB
0.24455154, 1
0.108798146, 9
0.111522496, 3
... etc

The input file can then be used to create any number of 'colname' entries, of any 'datatype' (I have scripts for that), with each type appearing 1 to n times, and with any number of rows (defined as 'elems'). My random-generating scripts customize the values per 'start' and 'end', but those columns are not relevant to this question.

Original Q&A

There are 2 answers

**Haroun Mohammedi** · Answer 1 · 2017-05-18T18:20:10+00:00


I think the RDD.zipWithUniqueId() or RDD.zipWithIndex() methods can do what you want.

Please refer to the official documentation for more information. Hope this helps.

**Tzach Zohar** · Answer 2 · 2017-05-18T18:45:58+00:00

Given a List[List[Any]], you can "zip" all these lists together using transpose, if you don't mind the result being a list-of-lists instead of a list of Tuples:

val result: Seq[List[Any]] = list.transpose
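The buffer in the question is typed ListBuffer[Any], so transpose will not compile on it directly: the compiler cannot see that each element is itself a list. A minimal sketch of one way to recover the type first, assuming every element really is a List (the names TransposeSketch, columns, and typed, and the shortened sample values, are illustrative only):

```scala
import scala.collection.mutable.ListBuffer

object TransposeSketch {
  // Hypothetical stand-in for the question's buffer, with shortened sample values.
  val columns: ListBuffer[Any] = ListBuffer(
    List(0.24455154, 0.108798146, 0.111522496),
    List(1, 9, 3)
  )

  // Recover the element type with a pattern match.
  // Assumption: elements that are not Lists are silently dropped.
  val typed: List[List[Any]] =
    columns.toList.collect { case xs: List[_] => (xs: List[Any]) }

  // transpose turns N columns into rows; it throws if the inner lists differ in length.
  val result: List[List[Any]] = typed.transpose
}
```

Because every column is generated with the same 'elems' count in the question's example, the equal-length requirement of transpose holds there; columns with differing counts would need padding or truncation first.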

If you then want to write this into a CSV, you can start by mapping each "row" into a comma-separated String:

val rows: Seq[String] = result.map(_.mkString(","))
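To get from those strings to the file the question asks for, one option is to write a header line followed by the rows using plain java.io. This is only a sketch: the helper writeCsv, the object name, and the output path are hypothetical, and no quoting or escaping is performed.

```scala
import java.io.{File, PrintWriter}

object CsvWriteSketch {
  // Writes a header line, then one comma-separated line per row.
  // Assumption: no value contains a comma, quote, or newline, so no escaping is needed.
  def writeCsv(path: String, header: Seq[String], rows: Seq[Seq[Any]]): Unit = {
    val out = new PrintWriter(new File(path))
    try {
      out.println(header.mkString(","))
      rows.foreach(row => out.println(row.mkString(",")))
    } finally {
      out.close() // always release the file handle, even if a write fails
    }
  }

  def main(args: Array[String]): Unit = {
    // Column names would come from the 'colname' field of the metadata rows.
    val header = Seq("colA", "colB")
    // List[Any] keeps the Int values from being widened to Double.
    val rows = Seq(List[Any](0.24455154, 1), List[Any](0.108798146, 9))
    writeCsv("output.csv", header, rows)
  }
}
```

The explicit List[Any] annotation matters here: without it, Scala infers List[Double] for a list mixing Double and Int literals, and the integer column would be written as 1.0 and 9.0.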

(note: I'm ignoring the Apache Spark part, which seems completely irrelevant to this question... the "metadata" is loaded via Spark, but then it's collected into an Array so it becomes irrelevant)
