Scala & Spark - ArrayBuffer does not append


I am new to Scala and Apache Spark and have been trying out some online examples.

I am using scala.collection.mutable.ArrayBuffer to store a list of tuples of the form (Int,Array[String]). I am creating an ArrayBuffer and then parsing a text file line by line and appending the required data from each line to the ArrayBuffer.

The code has no compilation errors. But when I access the ArrayBuffer outside the block where I am appending to it, I am not able to get the contents: the ArrayBuffer is always empty.

My code is below -

val conf = new SparkConf().setAppName("second")
val spark = new SparkContext(conf)

val file = spark.textFile("\\Desktop\\demo.txt")
var list = scala.collection.mutable.ArrayBuffer[(Int, Array[String])]()
var count = 0

file.map(_.split(","))
    .foreach { a =>
      count = countByValue(a) // returns an Int
      println("count is " + count) // showing correct output "count is 3"
      var t = (count, a)
      println("t is " + t) // showing correct output "t is (3,[Ljava.lang.String;@539f0af)"
      list += t
    }

println("list count is = " + list.length) // output "list count is = 0"
list.foreach(println) // no output

Can someone point out why this code isn't working?

Any help is greatly appreciated.

1 Answer

Gábor Bakos (accepted answer)

I assume `spark` is a `SparkContext`. In that case it is not surprising that the local list is not updated: the `foreach` runs on the executors, and each task receives only a serialized copy of the closure, including a copy of `list`. The copies are mutated on the workers and those mutations are never sent back to the driver, so the driver's `list` stays empty. If you need a mutable value shared across a `foreach`, you should use an Accumulator.
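A minimal sketch of two ways to fix this, assuming a Spark 2.x+ API and a stand-in `countByValue` helper (the question's real helper is not shown, so the one below is hypothetical):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object Demo {
  // Hypothetical stand-in for the asker's countByValue helper.
  def countByValue(a: Array[String]): Int = a.length

  def main(args: Array[String]): Unit = {
    val conf  = new SparkConf().setAppName("second").setMaster("local[*]")
    val spark = new SparkContext(conf)
    val file  = spark.textFile("demo.txt")

    // Option 1 (usually preferred): express the work as transformations
    // and bring the result back to the driver with collect(), instead of
    // mutating driver-side state from inside a closure.
    val list: Array[(Int, Array[String])] =
      file.map(_.split(","))
          .map(a => (countByValue(a), a))
          .collect()
    println("list count is = " + list.length)

    // Option 2: an accumulator, for simple aggregates such as a running
    // total. Tasks call add() on the workers; the merged value is only
    // safe to read on the driver, after the action completes.
    val total = spark.longAccumulator("total")
    file.map(_.split(",")).foreach(a => total.add(countByValue(a)))
    println("total is = " + total.value)

    spark.stop()
  }
}
```

Note that `collect()` pulls the whole result into driver memory, so it only suits data that fits there, and accumulator updates are guaranteed to be applied exactly once only inside actions like `foreach`, not inside lazy transformations.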