Hi, I am trying to do a map-side join in Crunch using the MapsideJoinStrategy class. It works fine for an inner join, but for a full outer join it gives this error: "Join type FULL_OUTER_JOIN not supported by MapsideJoinStrategy"


There is 1 answer

tworec On

MapsideJoinStrategy cannot perform a RIGHT_OUTER_JOIN, and therefore cannot perform a FULL_OUTER_JOIN either. This is impossible by design: all of the work happens in the mappers, with no reduce phase. Because there can be more than one mapper, and each mapper sees only part of the left-side data, no single mapper can determine which right-side keys have no matching key on the left side.
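To make the reasoning above concrete, here is a small plain-Java sketch. It does not use the Crunch API. It only simulates the situation described, assuming the right-side table is fully available in every mapper while the left side is split between mappers. The class and method names (`MapsideJoinSketch`, `unmatchedRightKeysSeenBy`, `trulyUnmatchedRightKeys`) are made up for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical simulation, not Crunch code: shows why a single mapper
// cannot decide which right-side keys are unmatched.
class MapsideJoinSketch {

    // Right-side keys that ONE mapper would consider unmatched,
    // judging only from its own split of the left-side keys.
    static Set<String> unmatchedRightKeysSeenBy(List<String> leftSplit,
                                                Map<String, String> right) {
        Set<String> unmatched = new TreeSet<>(right.keySet());
        unmatched.removeAll(leftSplit);
        return unmatched;
    }

    // The correct answer needs every left-side split, which is the
    // global view that only a reduce phase (or a single mapper) has.
    static Set<String> trulyUnmatchedRightKeys(List<List<String>> leftSplits,
                                               Map<String, String> right) {
        Set<String> unmatched = new TreeSet<>(right.keySet());
        for (List<String> split : leftSplits) {
            unmatched.removeAll(split);
        }
        return unmatched;
    }

    public static void main(String[] args) {
        // Right side: assumed to be loaded in memory by every mapper.
        Map<String, String> right = Map.of("a", "r1", "b", "r2", "c", "r3");
        // Left side: split across two mappers.
        List<List<String>> leftSplits = List.of(List.of("a"), List.of("b"));

        for (int i = 0; i < leftSplits.size(); i++) {
            System.out.println("mapper " + i + " thinks unmatched: "
                    + unmatchedRightKeysSeenBy(leftSplits.get(i), right));
        }
        System.out.println("actually unmatched: "
                + trulyUnmatchedRightKeys(leftSplits, right));
    }
}
```

In this sketch the first mapper would report `b` and `c` as unmatched and the second would report `a` and `c`, while only `c` is really unmatched. If each mapper emitted its local view, `a` and `b` would get spurious unmatched rows and `c` would be emitted twice.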

For FULL_OUTER_JOIN use DefaultJoinStrategy.

I've extended BloomFilterJoinStrategy to support all join types. Here is the pull request @ GitHub.