Lift algebird aggregator to consume (and return) Map

Question

Lift algebird aggregator to consume (and return) Map

202 views Asked by Todd Owen At 06 February 2017 at 01:36

The example in the README is very elegant:

scala> Map(1 -> Max(2)) + Map(1 -> Max(3)) + Map(2 -> Max(4))
res0: Map[Int,Max[Int]] = Map(2 -> Max(4), 1 -> Max(3))

Essentially the use of Map here is equivalent to SQL's group by.

But how do I do the same with an arbitrary Aggregator? For example, to achieve the same thing as the code above (but without the Max wrapper class):

scala> import com.twitter.algebird._
scala> val mx = Aggregator.max[Int]
mx: Aggregator[Int,Int,Int] = MaxAggregator(scala.math.Ordering$Int$@78c77)
scala> val mxOfMap = // what goes here?
mxOfMap: Aggregator[Map[Int,Int],Map[Int,Int],Map[Int,Int]] = ...
scala> mxOfMap.reduce(List(Map(1 -> 2), Map(1 -> 3), Map(2 -> 4)))
res0: Map[Int,Int] = Map(2 -> 4, 1 -> 3)

In other words, how to I convert (or "lift") an Aggregator that operates on values of type T into an Aggregator that operates on values of type Map[K,T] (for some arbitrary K)?

Original Q&A

There are 1 answers

**Todd Owen** · Answer 1 · 2017-02-06T02:50:04+00:00

Looks like this can be done fairly easily for Semigroup at least. This should be sufficient in the case where there is no additional logic in the "compose" or "present" phases of the aggregator which needs to be preserved (a Semigroup can be obtained from an Aggregator, discarding compose/prepare).

The code to answer the original question is:

scala> val sgOfMap = Semigroup.mapSemigroup[Int,Int](mx.semigroup)
scala> val mxOfMap = Aggregator.fromSemigroup(sgOfMap)
scala> mxOfMap.reduce(List(Map(1 -> 2), Map(1 -> 3), Map(2 -> 4)))
res0: Map[Int,Int] = Map(2 -> 4, 1 -> 3)

But in practice, it would be better to start by constructing the arbitrary Semigroup directly, rather than constructing an Aggregator merely to extract the semigroup from:

scala> import com.twitter.algebird._
scala> val mx = Semigroup.from { (x: Int, y: Int) => Math.max(x, y) }
scala> val mxOfMap = Semigroup.mapSemigroup[Int,Int](mx)
scala> mxOfMap.sumOption(List(Map(1 -> 2), Map(1 -> 3), Map(2 -> 4)))
res33: Option[Map[Int,Int]] = Some(Map(2 -> 4, 1 -> 3))

Alternatively, convert to aggregator: Aggregator.fromSemigroup(mxOfMap)

TechQA.

Lift algebird aggregator to consume (and return) Map

There are 1 answers

Related Questions in SCALA

Related Questions in ALGEBIRD

Popular Questions

Trending Questions