Is there a way to mimic R's higher-order (binary) function shorthand syntax within Spark or PySpark?


In R, I can write the following:

## Explicit
Reduce(function(x,y) x*y, c(1, 2, 3))
# returns 6

However, I can also do this less explicitly with the following:

## Less explicit
Reduce(`*`, c(1, 2, 3))
# also returns 6


In PySpark, I could do the following:

rdd = sc.parallelize([1, 2, 3])
rdd.reduce(lambda a, b: a * b)
# returns 6


Question: Can you mimic the "shorthand" (less explicit) syntax of R's Reduce(`*`, ...) in PySpark, or with some sort of anonymous function?

1 Answer

Accepted answer, by Nick Kennedy:

In R, you're supplying a binary function to Reduce. The multiply operator (like R's other arithmetic operators) is itself a binary function. Type

`*`(2, 3)

at the R console to see what I mean.

In Python, the equivalent for multiplication is operator.mul.
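As a quick illustration outside Spark, operator.mul is an ordinary two-argument function, directly analogous to calling `*`(2, 3) in R:

import operator
operator.mul(2, 3)
# returns 6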

So:

import operator

rdd = sc.parallelize([1, 2, 3])
rdd.reduce(operator.mul)  # returns 6
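The same pattern extends to any binary function in Python's operator module. For example, a minimal sketch (assuming the same SparkContext sc) that sums the RDD with operator.add:

import operator

rdd = sc.parallelize([1, 2, 3])
rdd.reduce(operator.add)
# returns 6 (1 + 2 + 3)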