I'm trying to split the RDD row below into five columns:
val test = [hello,one,,,]
val rddTest = test.rdd
val Content = rddTest.map(_.toString().replace("[", "").replace("]", ""))
.map(_.split(","))
.map(e ⇒ Row(e(0), e(1), e(2), e(3), e(4), e(5)))
When I execute this, I get a "java.lang.ArrayIndexOutOfBoundsException" because there are no values between the last three commas.
Any ideas on how to split the data?
Your code is correct, but after splitting you are trying to access 6 elements instead of 5.
Change the last map so that it builds the Row from e(0) through e(4) only, dropping e(5).
UPDATE
By default, String.split drops trailing empty strings. That is why your array has only 2 elements, and why e(2) already throws the exception. To keep the empty fields, pass a negative limit as the second argument to split, for example split(",", -1).
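The original snippet for this step is not available, so here is a minimal sketch of the idea in plain Scala (no Spark needed to see the behaviour). The object name SplitDemo and the helper fields are hypothetical names used only for illustration. In the pipeline from the question, the same change would presumably be `.map(_.split(",", -1))` followed by a Row built from e(0) through e(4).

```scala
object SplitDemo {
  // Hypothetical helper: strip the surrounding brackets, then split on commas.
  // The negative limit tells String.split to keep trailing empty strings.
  def fields(line: String): Array[String] =
    line.replace("[", "").replace("]", "").split(",", -1)

  def main(args: Array[String]): Unit = {
    val line = "[hello,one,,,]"

    // Default split: trailing empty strings are discarded.
    val dropped = line.replace("[", "").replace("]", "").split(",")
    println(dropped.length) // 2

    // Split with limit -1: all five fields survive, three of them empty.
    val kept = fields(line)
    println(kept.length) // 5
  }
}
```

With five elements in the array, indexing e(0) through e(4) no longer goes out of bounds.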
Note how split is called: with the negative limit, all five fields are retained, including the empty ones.