I want to convert the pipelined RDD shown below into a parallel RDD of the form described below.
Here's the schema:
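    from pyspark.sql.types import StructType, StructField, FloatType

    schemaString = "S"
    # 164 nullable float columns named S0 ... S163
    fields = [StructField(schemaString + str(i), FloatType(), True) for i in range(164)]
    schema = StructType(fields)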
The goal is to finally convert the result into a DataFrame with 164 columns.
I am having difficulty grouping every 164 consecutive numbers together into one row.
Current RDD:

    dec_RDD.take(4)
    [(120,), (-119,), (-125,), (-119,)]

new_dec_RDD should look like this:

    [(120, -119, -125, -119, ..........164 row elements), (164 row elements), ...]
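One possible way to do the grouping, as a rough sketch rather than a definitive answer: number the elements with `zipWithIndex`, derive a row id (`index // 164`) and a position within the row (`index % 164`), group by row id, and rebuild each row in position order. The helper names `to_row_key`, `assemble_row` and `regroup` are hypothetical, and the sketch assumes every element of `dec_RDD` is a 1-tuple holding a number. Values are cast to `float` because `FloatType` columns do not accept Python `int`s.

```python
def to_row_key(pair, width=164):
    """Map an (element, index) pair from zipWithIndex to (row_id, (position, value))."""
    element, index = pair
    # element is assumed to be a 1-tuple such as (120,); FloatType needs Python floats
    return index // width, (index % width, float(element[0]))


def assemble_row(indexed_values):
    """Sort one group's (position, value) pairs by position and keep only the values."""
    return tuple(value for _, value in sorted(indexed_values))


def regroup(rdd, width=164):
    """Turn an RDD of 1-tuples into an RDD of width-long tuples, keeping the original order."""
    return (
        rdd.zipWithIndex()                          # (element, index)
        .map(lambda pair: to_row_key(pair, width))  # (row_id, (position, value))
        .groupByKey()                               # (row_id, iterable of (position, value))
        .sortByKey()                                # restore row order
        .values()
        .map(assemble_row)                          # one tuple per row
    )
```

With this in place, `new_dec_RDD = regroup(dec_RDD)` followed by `spark.createDataFrame(new_dec_RDD, schema)` should give the 164-column DataFrame, assuming `spark` is an existing `SparkSession`. If the number of elements is not a multiple of 164, the last row comes out short and would need to be padded or dropped before building the DataFrame.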