I want to get two RDDs from Cassandra and then join them, skipping rows where the value is empty.
def extractPair(rdd: RDD[CassandraRow]) = {
  rdd.map((row: CassandraRow) => {
    val name = row.getName("name")
    if (name == "")
      None // join wrong
    else
      (name, row.getUUID("object"))
  })
}
val rdd1 = extractPair(cassRdd1)
val rdd2 = extractPair(cassRdd2)
val joinRdd = rdd1.join(rdd2) //"None" join wrong
Using flatMap fixes this, but I want to know how to fix it using map.
def extractPair(rdd: RDD[CassandraRow]) = {
  rdd.flatMap((row: CassandraRow) => {
    val name = row.getName("name")
    if (name == "")
      Seq()
    else
      Seq((name, row.getUUID("object")))
  })
}
This isn't possible with just a map. You would need to follow it up with a filter, and you would still be best off wrapping the valid result in a Some. But then the result would still be wrapped in a Some, requiring a second map to unwrap it. So, realistically, your best option is to keep flatMap and return an Option (None for an empty name, Some of the pair otherwise) rather than a Seq. Option is implicitly convertible to a flattenable type and conveys your method's intent better.
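To make the comparison concrete, here is a minimal sketch of both approaches. It is an illustration under stated assumptions, not the original answer's code: `Row`, `ExtractPairSketch`, `viaMap` and `viaFlatMap` are hypothetical names, and a plain `Seq` stands in for the RDD so the snippet runs without Spark or Cassandra.

```scala
import java.util.UUID

// Hypothetical stand-in for CassandraRow, so the sketch runs without Spark or Cassandra.
final case class Row(name: String, obj: UUID)

object ExtractPairSketch {
  // map-based chain: wrap each result in an Option, drop the empties, then unwrap.
  // This is the map + filter + second map sequence described above.
  def viaMap(rows: Seq[Row]): Seq[(String, UUID)] =
    rows
      .map(row => if (row.name.isEmpty) None else Some((row.name, row.obj)))
      .filter(_.isDefined)
      .map(_.get)

  // flatMap with Option: None contributes nothing, Some(pair) contributes the pair.
  def viaFlatMap(rows: Seq[Row]): Seq[(String, UUID)] =
    rows.flatMap(row => if (row.name.isEmpty) None else Some((row.name, row.obj)))
}
```

Both functions return only the pairs with a non-empty name. RDD also provides map, filter and flatMap, so the same shape should carry over to extractPair, but that step is an assumption here rather than something this sketch exercises.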