I want to get two RDDs from Cassandra and then join them, skipping any rows that have an empty value.
def extractPair(rdd: RDD[CassandraRow]) = {
  rdd.map((row: CassandraRow) => {
    val name = row.getName("name")
    if (name == "")
      None //join wrong
    else
      (name, row.getUUID("object"))
  })
}
val rdd1 = extractPair(cassRdd1)
val rdd2 = extractPair(cassRdd2)
val joinRdd = rdd1.join(rdd2) //"None" join wrong
Using flatMap fixes this, but I want to know how to fix it using map:
def extractPair(rdd: RDD[CassandraRow]) = {
  rdd.flatMap((row: CassandraRow) => {
    val name = row.getName("name")
    if (name == "")
      Seq()
    else
      Seq((name, row.getUUID("object")))
  })
}
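This isn't possible with just a map, because map always produces exactly one output element per input row, so it cannot drop the empty ones. You would need to follow it up with a filter. Even then, you would be best off wrapping the valid result in a Some (and returning None otherwise), but that leaves every remaining element wrapped in a Some, requiring a second map to unwrap it. So, realistically, your best option is to keep flatMap but return an Option instead of a Seq: None for an empty name and Some of the pair otherwise. Option is implicitly convertible to a flattenable type and conveys your method's intent better.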
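As a rough sketch of the two routes described above, here they are side by side. It assumes a hypothetical Row case class as a stand-in for CassandraRow and uses plain Scala collections in place of RDDs, so it runs without Spark or Cassandra. The names Row, viaMapFilterMap and viaFlatMap are made up for illustration.

```scala
import java.util.UUID

// Hypothetical stand-in for CassandraRow (assumption, not the connector's API).
final case class Row(name: String, obj: UUID)

object ExtractPairSketch {
  // map-only route: wrap each result in an Option, filter out the Nones,
  // then use a second map to unwrap the remaining Somes.
  def viaMapFilterMap(rows: Seq[Row]): Seq[(String, UUID)] =
    rows
      .map(r => if (r.name == "") None else Some((r.name, r.obj)))
      .filter(_.isDefined)
      .map(_.get)

  // flatMap route: None contributes no element, Some(pair) contributes the pair.
  def viaFlatMap(rows: Seq[Row]): Seq[(String, UUID)] =
    rows.flatMap(r => if (r.name == "") None else Some((r.name, r.obj)))
}
```

The same lambda shape should carry over to an actual RDD via rdd.flatMap, since Option is accepted wherever flatMap expects a collection of results. Field access would then go through the row getters from the question rather than the hypothetical Row.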