Omit input data of map function in Scala

Question

441 views Asked by Shen Li At 09 June 2015 at 16:49

I am reading the Spark source code and am confused by the following code:

/**
 * Return a new RDD containing the distinct elements in this RDD.
 */
def distinct(numPartitions: Int)(implicit ord: Ordering[T] = null): RDD[T] =
  map(x => (x, null)).reduceByKey((x, y) => x, numPartitions).map(_._1)

What is the input data for the map(x => (x, null)) call? Why, and when, can the input be omitted?

UPDATE:

Here is the link to the source code.

There are 2 answers

Daenyth On 09 June 2015 at 17:00

The map in map(x => (x, null)) is the map method defined by the class itself, so it is invoked on the current instance.

I don't understand your question about omitting the input. In Scala, you can't call a function that expects an argument without giving it that argument.
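To make the point about the class's own map concrete, here is a minimal sketch in plain Scala. Bag, pairWithNull and pairWithNullExplicit are hypothetical names invented for this illustration and are not part of Spark. The sketch only shows that a method called without a receiver inside a class is resolved as this.method, so the "input" is the instance itself.

```scala
// Hypothetical stand-in for an RDD-like class; Bag is not part of Spark.
final case class Bag[T](items: List[T]) {
  // Higher-order method: applies f to every element held by this Bag.
  def map[U](f: T => U): Bag[U] = Bag(items.map(f))

  // No receiver is written, so Scala resolves map as this.map.
  // The "input data" is therefore the Bag that pairWithNull is called on.
  def pairWithNull: Bag[(T, Null)] = map(x => (x, null))

  // The same call with the receiver spelled out.
  def pairWithNullExplicit: Bag[(T, Null)] = this.map(x => (x, null))
}

object ImplicitReceiverDemo {
  def main(args: Array[String]): Unit = {
    val bag = Bag(List(1, 2, 2))
    println(bag.pairWithNull)
    // Both spellings produce the same result.
    println(bag.pairWithNull == bag.pairWithNullExplicit)
  }
}
```

Nothing is omitted in either spelling: the function argument x => (x, null) is still passed explicitly, and only the receiver (this) is left implicit.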

**DNA** · Accepted Answer · 2015-06-09T17:05:30+00:00

distinct and map are both methods on the RDD class (source), so distinct is just calling another method on the same RDD.

The map function is a higher-order function, i.e. it accepts a function as one of its parameters (f: T => U):

/**
 * Return a new RDD by applying a function to all elements of this RDD.
 */
def map[U: ClassTag](f: T => U): RDD[U] = withScope {
  val cleanF = sc.clean(f)
  new MapPartitionsRDD[U, T](this, (context, pid, iter) => iter.map(cleanF))
}

In the case of distinct, the parameter f to map is the anonymous function x => (x, null).

Here's a simple example of using an anonymous function (lambda), in the Scala REPL (using the similar map function on a Scala list, not a Spark RDD):

scala> List(1,2,3).map(x => x + 1)
res0: List[Int] = List(2, 3, 4)
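The three steps inside distinct can be imitated on an ordinary Scala List. This is only a rough sketch under stated assumptions: the reduceByKey helper below is a hypothetical, single-machine stand-in written with groupBy, not Spark's implementation, and DistinctSketch and distinctViaPairs are names made up for the example.

```scala
object DistinctSketch {
  // Plain-collections stand-in for Spark's reduceByKey (hypothetical helper):
  // groups the pairs by key and merges each key's values with f.
  def reduceByKey[K, V](pairs: List[(K, V)])(f: (V, V) => V): List[(K, V)] =
    pairs.groupBy(_._1).toList.map { case (k, kvs) => (k, kvs.map(_._2).reduce(f)) }

  // Mirrors the three steps of distinct: pair, merge per key, drop the value.
  def distinctViaPairs[T](xs: List[T]): List[T] = {
    val paired = xs.map(x => (x, null))           // every element becomes a key
    val merged = reduceByKey(paired)((x, y) => x) // one pair survives per key
    merged.map(_._1)                              // keep only the keys
  }

  def main(args: Array[String]): Unit =
    // groupBy gives no ordering guarantee, so sort before printing.
    println(distinctViaPairs(List(3, 1, 3, 2, 1)).sorted)
}
```

The null value is just a placeholder: only the keys matter, and merging the values per key is what collapses the duplicates.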
