How to emit tuples instead of a list of tuples


I have a Scalding job that looks like this:

import com.twitter.scalding.{Args, Csv, Job, TextLine}

class DataJob(args: Args) extends Job(args) {

  val input = args("input")
  val output = Csv(args("output"), separator = ",")

  def parseLine(x: String): Seq[(String, String, String, String)] = {
    List(("a", "b", "c", "d")) // returns a list of tuples, not a single tuple
  }

  TextLine(input).mapTo('line -> ('v1, 'v2, 'v3, 'v4)) { x: String =>
    parseLine(x) // this code fails with an arity error
  }.write(output)
}

When it runs, I get the following error:

Caused by: java.lang.AssertionError: assertion failed: Arity of (class com.twitter.scalding.LowPriorityTupleSetters$$anon$2) is 1, which doesn't match: + ('v1', 'v2', 'v3', 'v4')

This is because my parseLine function returns a list of tuples but the code expects a single tuple to be emitted. How can I get this code to work?

There is 1 answer

JMM On

Ok, looks like I just needed to change:

TextLine(input).mapTo('line -> ('v1, 'v2, 'v3, 'v4))

to:

TextLine(input).flatMap('line -> ('v1, 'v2, 'v3, 'v4))

Still not exactly clear why, so any responses would be appreciated!
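
For what it's worth, a likely explanation: mapTo emits exactly one tuple per input line, so whatever the function returns has to fit the four declared fields. A List is not a 4-tuple; judging by the error message, Scalding falls back to a generic setter that treats the whole list as a single field, hence "Arity ... is 1". flatMap instead expects the function to return a collection and emits one tuple per element, so each (String, String, String, String) lines up with ('v1, 'v2, 'v3, 'v4). The same distinction exists on plain Scala collections, which is all the sketch below shows. It does not use Scalding at all, and the object name MapVsFlatMap and the sample lines are made up for illustration.

```scala
// Plain-Scala analogy only (no Scalding involved); names here are hypothetical.
object MapVsFlatMap {
  // Same shape as parseLine in the question: one input line yields a Seq of 4-tuples.
  def parseLine(x: String): Seq[(String, String, String, String)] =
    List(("a", "b", "c", "d"))

  // Stand-in for the lines that TextLine would read.
  val lines: List[String] = List("first line", "second line")

  // map: one output per input, and that output is the whole Seq.
  // This is the shape mapTo sees, which does not fit four fields.
  val mapped: List[Seq[(String, String, String, String)]] =
    lines.map(parseLine)

  // flatMap: each returned Seq is unpacked, so the results are the 4-tuples
  // themselves. This is the shape that matches ('v1, 'v2, 'v3, 'v4).
  val flattened: List[(String, String, String, String)] =
    lines.flatMap(parseLine)

  def main(args: Array[String]): Unit = {
    println(mapped)
    println(flattened)
  }
}
```

One thing to watch: flatMap, unlike mapTo, keeps the input fields alongside the new ones. If only 'v1 through 'v4 should reach the CSV, flatMapTo with the same field arguments is probably the closer replacement for mapTo.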