How can I sort elements of a TypedPipe in Scalding?


I have not been able to find a way to sort the elements of a TypedPipe in Scalding outside of a group operation. Here are the relevant parts of my program, with irrelevant parts replaced by ellipses:

  case class ReduceOutput(slug: String, score: Int, json1: String, json2: String)

  val pipe1: TypedPipe[(String, ReduceFeatures)] = ...
  val pipe2: TypedPipe[(String, ReduceFeatures)] = ...

  pipe1.join(pipe2)
    // Each joined entry is (slug, (features from pipe1, features from pipe2)).
    .map { case (slug, (features1, features2)) =>
      ReduceOutput(
        slug,
        computeScore(features1, features2),
        features1.json,
        features2.json)
    }
    .write(TypedTsv[ReduceOutput](args("output")))

Is there a way to sort the elements on their score after the map but before the write?
